ABSTRACT


Several  baseband  adaptive  communication  systems  are  presented  and 
analyzed  with  regard  to  "self-synchronization"  properties,  intersymbol 
interference,  and  convergence  properties.  Recent  convergence  results, 
the  proofs  of  which  are  contained  in  a companion  report,  are  applied  to 
provide  extremely  mild  "covariance  decay-rate  conditions"  for  which 
the  algorithms  treated  converge  with  probability  1.  Of  special  interest 
are  the  convergence  results  treating  correlated  cyclostationary  training 
data.  Recent  results  on  maximum-likelihood  sequence  estimation  are 
extended  to  treat  the  detection  of  general  "nonlinearly  modulated" 
digital  data  over  linear  dispersive  channels  and  nonwhite  additive  noise. 
Adaptive  techniques  for  training  the  new  detector  structure  are  proposed 
for  use  when  the  channel  and/or  the  noise  covariance  function  are  unknown. 
I. INTRODUCTION

The large body of literature treating adaptive digital communication
over an unknown dispersive linear channel covers a large number of
adaptive schemes. Most of the proposed adaptive schemes make use of
either a form of stochastic approximation algorithm or the least
mean-square (LMS) algorithm [12] to train the weight vector of a
transversal filter. Both types of algorithms can be called "stochastic
gradient-following algorithms." The receiver structures that have been
proposed using such transversal filters range from adaptive linear
equalizers to nonlinear decision-feedback equalizers to combinations
of both under a wide variety of "optimality" criteria. The excellent
review paper by Lucky [11] surveys the major structures and
philosophies used.

Convergence properties of algorithms used for adaptive signal
processing do not appear to have received nearly as much attention
as the receiver structures themselves. In Chapter II, recent
convergence results that are generally applicable to most algorithms
proposed for adaptive signal processing applications are presented.
The proofs of these convergence results are presented in a companion
report [1].

In Chapter III, two adaptive receiver structures, developed at
the Naval Undersea Center, San Diego, are presented. The adaptive
direct channel modeller is shown to converge to a cyclical shift of
the sum of contiguous baud-length portions of the channel unit pulse
response. Such a property is highly desirable in that it leaves the
detector performance invariant to phase differences in transmitter
and receiver clocks. The adaptive inverse channel modeller is also
shown to have a kind of self-synchronization feature. Convergence
properties of several possible algorithms for each receiver are
established using the results of Chapter II. Of special interest are
the convergence results for cyclostationary training data, showing
that one is not required to average the data over one period before
iterating the algorithm.

Chapter IV contains a treatment of maximum-likelihood sequence
estimation. This problem has been treated previously by Forney [9]
for pulse amplitude modulation (PAM), duration-limited channels, and
additive white Gaussian noise, and by Ungerboeck [10] for modulation
schemes which can be treated as complex PAM (e.g., PAM, phase
modulation, quadrature modulation), duration-limited channels, and
colored Gaussian noise. Ungerboeck [10] also proposes adaptive
procedures for use when the channel impulse response and the noise
covariance function are unknown. Both Forney and Ungerboeck make use
of some form of the Viterbi algorithm. Magee and Proakis [14] propose
an adaptive decision-feedback receiver structure to estimate the
discrete-time channel response and incorporate this estimate with the
results of Forney. Qureshi and Newhall [15] propose the use of an
adaptive equalizer in cascade with a fixed Viterbi detector. The
adaptive equalizer shortens the duration of the resulting intersymbol
interference, and hence can result in a large savings in computational
requirements for the Viterbi detector. In Chapter IV of the present
work, a maximum-likelihood sequence estimator is developed for general
digital modulation schemes, i.e., schemes in which a different
waveform is transmitted for each alphabet symbol.

In Chapter IV, such schemes are called "nonlinear modulation,"
which is, perhaps, a misleading term. In any case, the reader should
interpret "nonlinear modulation" as used here to denote modulation
schemes that cannot necessarily be treated as complex PAM.
Decision-directed adaptive procedures for implementing adaptive
maximum-likelihood sequence estimation for these "nonlinear" schemes
when the channel pulse response and the noise covariance function are
unknown are also developed in Chapter IV.
II. SUMMARY OF RECENT CONVERGENCE RESULTS FOR STOCHASTIC
APPROXIMATION ALGORITHMS

In a recent technical report [1] the author presents sufficient
conditions for the almost sure (a.s.) convergence of a family of
stochastic approximation algorithms. The family of algorithms
treated includes those which can be suggestively called "stochastic
gradient-following algorithms." Most algorithms commonly proposed
for use in adaptive signal processing applications are included in
the family of algorithms treated in [1]. Notable exceptions are
algorithms which employ a "constant gain sequence." The only
convergence results known to the author that treat such algorithms
when the training data is correlated are those of Daniell [2] and Kim
and Davisson [3]. The interested reader is referred to [4]-[5] for a
more complete discussion of these points. The author strongly feels
that the most generally applicable convergence results for algorithms
suitable for adaptive signal processing applications when the training
data is correlated are those in [1]. In this chapter, the results of
[1] which are considered to be more practically useful are presented.
The proofs are contained in [1] and will not be repeated here.


Let $\{W_n\}$, $W_n \in R^p$, satisfy the recursion

$$ W_{n+1} = W_n + \mu_n (P_n - F_n W_n) \tag{2.1} $$

for $n = 1, 2, \ldots$, where $R^p$ denotes the real p-dimensional
Euclidean space, $\{\mu_n\}$ is a nonincreasing sequence of positive
constants, $\{P_n\}$, $P_n \in R^p$, is a sequence of random variables,
and $\{F_n\}$ is a sequence of real symmetric nonnegative definite
$p \times p$ random matrices. The desired convergence is that of $W_n$
to $w_o = R^{-1}P$, where

$$ R = \lim_{n\to\infty} n^{-1} \sum_{\ell=1}^{n} E(F_\ell), \tag{2.2} $$

$$ P = \lim_{n\to\infty} n^{-1} \sum_{\ell=1}^{n} E(P_\ell), \tag{2.3} $$

and $E(\cdot)$ denotes statistical expectation. The type of convergence
of $W_n$ to $w_o$ under consideration is almost sure (a.s.) convergence.
If $W_n \xrightarrow{a.s.} w_o$, then $\Pr(\lim_{n\to\infty} W_n = w_o) = 1$,
where $\Pr(A)$ denotes the probability of event A. Note that R and P are
simply the "time averages" of $E(F_n)$ and $E(P_n)$, respectively. It is
assumed that R is positive definite and that
$n^{-1} \sum_{\ell=a+1}^{n+a} E(F_\ell)$ converges uniformly to R as
$n \to \infty$ for all positive integers a. This uniform convergence
requirement is easily satisfied if $E(F_n)$ is either constant or
periodic. It is further assumed that the sequence $\{\mu_n\}$ is such
that $0 < \mu_{n+1} \le \mu_n$ for $n = 1, 2, \ldots$, and
$0 < \lim_{n\to\infty} n\mu_n < \infty$. The single exception to this
final requirement on $\{\mu_n\}$ is in Theorem 8.

The norm of a $p \times p$ matrix A, denoted by $\|A\|$, is defined here
by $\|A\| = \max_{|w|=1} |w'Aw|$, where $w \in R^p$. The ijth element of
A is denoted by $(A)_{i,j}$. In case A is symmetric and nonnegative
definite, $\|A\| = \lambda_{\max}(A)$, where $\lambda_{\max}(A)$ is the
maximum eigenvalue of A. The norm of $w \in R^p$, denoted by $|w|$, is
taken to be $|w| = (w'w)^{1/2}$.
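The identity between the quadratic-form norm and the largest eigenvalue can be checked numerically. The following is only an illustrative sketch: the matrix `A`, the helper name `quad_form_norm`, and the random-search approach are assumptions made for the example, not part of the report.

```python
import numpy as np

def quad_form_norm(A, trials=20000, seed=0):
    """Estimate max_{|w|=1} |w'Aw| by sampling random unit vectors w."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((trials, A.shape[0]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    # Row-wise quadratic forms w_i' A w_i
    return np.max(np.abs(np.einsum("ij,jk,ik->i", W, A, W)))

B = np.array([[2.0, 1.0], [0.0, 1.0], [1.0, 3.0]])
A = B.T @ B                          # symmetric nonnegative definite (2 x 2)
lam_max = np.linalg.eigvalsh(A)[-1]  # eigvalsh returns ascending eigenvalues
est = quad_form_norm(A)
```

The sampled estimate can only approach the maximum eigenvalue from below, which is why it serves as a sanity check rather than a way to compute the norm.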

The convergence results presented in Section II-B involve "covariance
decay rate" conditions on the processes $\{F_n\}$ and $\{P_n\}$. The
required covariance functions and relationships between them are
presented here for ready reference. Define

$$ C_k = P_k - F_k w_o, \tag{2.4} $$

$$ \rho_F(k,\ell) = E(F_k F_\ell) - E(F_k)E(F_\ell), \tag{2.5} $$

$$ \rho_P(k,\ell) = E(P_k' P_\ell) - E(P_k')E(P_\ell), \tag{2.6} $$

and

$$ \rho_{PF}(k,\ell) = E(P_k' F_\ell) - E(P_k')E(F_\ell). \tag{2.7} $$

Then

$$ \rho_C(k,\ell) = E(C_k' C_\ell) - E(C_k')E(C_\ell) \tag{2.8} $$

can be expressed as

$$ \rho_C(k,\ell) = \rho_P(k,\ell) + w_o'\rho_F(k,\ell)w_o - \rho_{PF}(k,\ell)w_o - \rho_{PF}(\ell,k)w_o. \tag{2.9} $$
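The step from (2.8) to (2.9) follows by expanding the product with (2.4). A short worked version, using only the symmetry of $F_k$ and the fact that each term is a scalar, might read:

```latex
\begin{align*}
C_k' C_\ell &= (P_k - F_k w_o)'(P_\ell - F_\ell w_o) \\
            &= P_k' P_\ell + w_o' F_k F_\ell w_o
               - P_k' F_\ell w_o - w_o' F_k P_\ell , \\
w_o' F_k P_\ell &= (w_o' F_k P_\ell)' = P_\ell' F_k w_o
               \qquad \text{(scalar, } F_k' = F_k \text{)} .
\end{align*}
```

Taking expectations and subtracting the corresponding expansion of $E(C_k')E(C_\ell)$ pairs the four terms with $\rho_P(k,\ell)$, $w_o'\rho_F(k,\ell)w_o$, $\rho_{PF}(k,\ell)w_o$, and $\rho_{PF}(\ell,k)w_o$, respectively.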

Define

$$ \rho_{PFP}(k,\ell,n) = E((P_k' - E(P_k'))F_n(P_\ell - E(P_\ell))), \tag{2.10} $$

$$ \rho_{FFF}(k,\ell,n) = E((F_k - E(F_k))F_n(F_\ell - E(F_\ell))), \tag{2.11} $$

and

$$ \rho_{PFF}(k,\ell,n) = E((P_k' - E(P_k'))F_n(F_\ell - E(F_\ell))). \tag{2.12} $$

Then

$$ \rho_{FC}(k,\ell,n) = E((C_k' - E(C_k'))F_n(C_\ell - E(C_\ell))) \tag{2.13} $$

can be expressed as

$$ \rho_{FC}(k,\ell,n) = \rho_{PFP}(k,\ell,n) + w_o'\rho_{FFF}(k,\ell,n)w_o - \rho_{PFF}(k,\ell,n)w_o - \rho_{PFF}(\ell,k,n)w_o. \tag{2.14} $$


B. Generally Applicable Convergence Results

The reader is cautioned against assuming that all of the conditions
given below are necessary for the a.s. convergence of $W_n$ to $w_o$.
Proper combinations of sufficient conditions are presented in the
theorems below. All of the assumptions made in Section II-A will be
assumed to hold throughout the remainder of this chapter, with the
single exception of Theorem 8.

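CONDITION A1. Define

$$ \gamma_F(i_1, i_2, i_3, i_4) = \max_{|w|=1} E\Big( \prod_{s=1}^{4} w'(F_{i_s} - E(F_{i_s}))w \Big). \tag{2.15} $$

Define $p_k(\alpha) = a + [k^{\alpha}]$, $\nu_1(\alpha) = 1$,
$\nu_{k+1}(\alpha) = \nu_k(\alpha) + p_k(\alpha)$, and
$J_k(\alpha) = (\nu_k(\alpha), \nu_k(\alpha)+1, \ldots, \nu_{k+1}(\alpha)-1)$,
for $k = 1, 2, \ldots$, where a is a positive integer, [ ] denotes
integer part, and $0 \le \alpha < 1$. Condition A1 is that

$$ \sum_{k=1}^{\infty} p_k^{-4}(\alpha) \sum_{i,j,\ell,m \in J_k(\alpha)} \gamma_F(i,j,\ell,m) < \infty \tag{2.16} $$

for some $\alpha$, $0 \le \alpha < 1$, and for some positive integer a.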

CONDITION A2. There exists a real-valued nonnegative function
$f(i,j)$ such that for some $\delta > 1/2$, $|u|^{\delta} f(k,k+u)$ is
uniformly bounded for all nonnegative integers k and u, and such that

$$ |\gamma_F(i,j,\ell,m)| \le f^2(i,j) f^2(\ell,m) + f(i,j) f(i,\ell) f(j,m) f(\ell,m). \tag{2.17} $$


CONDITION A3. For some $\nu > 0$,

$$ u^{\nu} \max\{ \|\rho_F(k,k+u)\|, |\rho_C(k,k+u)|, |\rho_{FC}(k,k+u,n)| \} $$

is uniformly bounded for all nonnegative integers k, u, and n.


CONDITION A4. The quantity

$$ g_n = \sum_{k=n}^{\infty} \mu_k E(C_k) \tag{2.18} $$

exists and for some $\delta > 1$,

$$ \sum_{n=1}^{\infty} |g_n|^{\delta} E(\|F_n\|^{\delta}) < \infty. \tag{2.19} $$

THEOREM 1. Suppose that the structure and basic assumptions of
Section II-A are satisfied. If either Condition A1 or A2 is satisfied,
and if both Conditions A3 and A4 are satisfied, then
$W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.

CONDITION A5. The sequence $\{\|F_n\|\}$ is a.s. bounded (in n).

CONDITION A6. The quantity $g_n$ given by (2.18) exists.

CONDITION A7. For some $\nu > 0$,
$u^{\nu} \max\{ \|\rho_F(k,k+u)\|, |\rho_C(k,k+u)| \}$
is uniformly bounded for all nonnegative integers k and u.

THEOREM 2. Suppose that the structure and basic assumptions of
Section II-A are satisfied. If either Condition A1 or A2 is satisfied,
and if Conditions A5, A6, and A7 are all satisfied, then
$W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.

CONDITION A8. For some $\nu > 0$, $u^{\nu} |\rho_P(k,k+u)|$ is uniformly
bounded for all nonnegative integers k and u.

CONDITION A9. The sequence $\{F_n\}$ is deterministic and the quantity

$$ g_n = \sum_{k=n}^{\infty} \mu_k (E(P_k) - F_k w_o) \tag{2.20} $$

exists and $F_n g_n \to 0$ as $n \to \infty$.

THEOREM 3. Suppose that the structure and basic assumptions of
Section II-A are satisfied. If Conditions A8 and A9 are satisfied,
and if $\mu_n F_n \to 0$ as $n \to \infty$, then
$W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.

CONDITION A10. The sequence $\{P_n\}$ is deterministic and the
quantity

$$ g_n = \sum_{k=n}^{\infty} \mu_k E(C_k) \tag{2.21} $$

exists and for some $\delta > 1$,

$$ \sum_{n=1}^{\infty} |g_n|^{\delta} E(\|F_n\|^{\delta}) < \infty. \tag{2.22} $$

CONDITION A11. For some $\nu > 0$,
$u^{\nu} \max\{ \|\rho_F(k,k+u)\|, \|\rho_{FFF}(k,k+u,n)\| \}$
is uniformly bounded for all nonnegative integers k, u, and n.

THEOREM 4. Suppose that the structure and basic assumptions of
Section II-A are satisfied. If either Condition A1 or A2 is satisfied,
and if both Conditions A10 and A11 are satisfied, then
$W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.

THEOREM 5. Suppose that $\{F_n\}$ and $\{P_n\}$ are deterministic and
that the structure and basic assumptions of Section II-A are satisfied.
If Condition A9 is satisfied, and if $\mu_n F_n \to 0$ as $n \to \infty$,
then $W_n \to w_o$ as $n \to \infty$.


Definition. A sequence of random variables $\{y_n\}$ is said to be
M-dependent if for all index sets I and J, with
$\min_{n \in I, m \in J} |n-m| > M$, the two sets of random variables
$\{y_n : n \in I\}$ and $\{y_m : m \in J\}$ are statistically independent.

CONDITION A12. The quantities $\gamma_F(k,k,k,k)$, $\rho_P(k,k)$,
$\rho_{PF}(k,k)$, $\rho_{PFP}(k,k,k)$, $\rho_{PFF}(k,k,k)$ are all bounded
and the sequences $\{F_n\}$ and $\{P_n\}$ are M-dependent.

CONDITION A13. The quantity $g_n$ given by (2.18) exists and
$\sum_{n=1}^{\infty} |g_n|^2 < \infty$.
THEOREM 6. Suppose that the structure and basic assumptions
of Section II-A are satisfied. If Conditions A12 and A13 are satisfied,
then $W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.

C. Special Families of $F_n$ and $P_n$

Let $\{X_j\}_{j=-\infty}^{\infty}$ be a sequence of $R^p$-valued zero-mean
random variables and let $\{s_j\}_{j=-\infty}^{\infty}$ be a sequence of
real-valued zero-mean random variables. Define
$R_{xx}(k,\ell) = E(X_k X_\ell')$, $P_s(k,\ell) = E(s_k X_\ell)$, and
$\rho_s(k,\ell) = E(s_k s_\ell)$. Assume that $R_{xx}(k,k+u)$,
$P_s(k,k+u)$ and $\rho_s(k,k+u)$ are all periodic in k with period N
for all integers u.

It is helpful in what follows to consider a "typical" problem in
adaptive signal processing. Suppose that it is desired to choose
$w \in R^p$ to minimize

$$ \begin{aligned}
\xi(w) &= N^{-1} \sum_{k=1}^{N} E((s_k - w'X_k)^2) \\
       &= N^{-1} \sum_{k=1}^{N} (\rho_s(k,k) - 2w'P_s(k,k) + w'R_{xx}(k,k)w) \\
       &= \sigma_s^2 - 2w'P + w'Rw,
\end{aligned} \tag{2.23} $$

where

$$ R = N^{-1} \sum_{k=1}^{N} R_{xx}(k,k), \tag{2.24} $$

and

$$ P = N^{-1} \sum_{k=1}^{N} P_s(k,k). \tag{2.25} $$

It is well known that if R is positive definite, then the desired
solution is $w_o = R^{-1}P$. Assume now that R and/or P are unknown, and
that it is desired to use algorithm (2.1), with $F_n$ and $P_n$ functions
of the observed time series $\{X_j\}$ and $\{s_j\}$. Obvious candidates
for $F_n$ and $P_n$ which satisfy the basic structure required in
Section II-A are

$$ F_n = K_n^{-1} \sum_{j=n-K_n+1}^{n} X_j X_j' \tag{2.26} $$

and

$$ P_n = K_n^{-1} \sum_{j=n-K_n+1}^{n} s_j X_j, \tag{2.27} $$

where $K_n$ is a positive integer; e.g., $K_n = 1$, K, N, or n.
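To make the framework concrete, the recursion (2.1) with the windowed estimates (2.26) and (2.27) can be sketched as follows. This is a minimal illustration under assumed conditions: the synthetic white Gaussian data, the target vector `w_true`, the gain $\mu_n = a/(n+b)$ with the particular values of `a` and `b`, and all function names are hypothetical choices for the example, not taken from the report.

```python
import numpy as np

def window_estimates(X, s, n, K):
    """F_n and P_n of (2.26)-(2.27) from samples j = n-K+1, ..., n (0-based)."""
    Xw = X[n - K + 1 : n + 1]
    sw = s[n - K + 1 : n + 1]
    F = Xw.T @ Xw / K          # average of X_j X_j'
    P = Xw.T @ sw / K          # average of s_j X_j
    return F, P

def run_recursion(X, s, K=1, a=1.0, b=10.0):
    """Iterate W_{n+1} = W_n + mu_n (P_n - F_n W_n), mu_n = a / (n + b)."""
    W = np.zeros(X.shape[1])
    for n in range(K - 1, len(s)):
        F, P = window_estimates(X, s, n, K)
        mu = a / (n + 1 + b)   # n + 1 converts the 0-based index to a count
        W = W + mu * (P - F @ W)
    return W

rng = np.random.default_rng(1)
w_true = np.array([1.0, -0.5, 0.25])        # assumed minimizer w_o
X = rng.standard_normal((20000, 3))          # assumed data with R = I
s = X @ w_true + 0.1 * rng.standard_normal(20000)
W_hat = run_recursion(X, s, K=1)
```

With `K=1` each iteration uses a single sample, which is the cheapest member of the family; larger `K` averages more samples per step at a higher cost per iteration.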


LEMMA 1. Let $F_n$ and $P_n$ be given by (2.26) and (2.27) and
$K_n = K$, a constant. Suppose the entire sequence $\{\mu_k\}$ satisfies
either $\mu_k = a([\,\cdot\,] + b)^{-1}$ or $\mu_k = a(k+b)^{-1}$, where
$a > 0$, $b \ge 0$, and [ ] denotes integer part. Suppose further that
$\|R_{xx}(k,k)\|$ is bounded (in k). Then Condition A4 is satisfied
(with $\delta = 2$). (Note that Conditions A6 and A13 are also
satisfied.)


Note that Conditions A1 through A4 can be interpreted as quite
mild "covariance decay-rate" conditions on the sequences $\{F_n\}$ and
$\{P_n\}$. Lemma 1 provides several choices of sequences $\{\mu_k\}$ in
order to satisfy Condition A4. In case either $N = 1$ or $K_n = N$, then
$E(C_k) = 0$, and Condition A4 may be replaced by the condition that
$E(\|F_n\|^{\delta})$ is bounded (in n) for some $\delta > 1$. In view of
the widespread use of algorithms fitting the framework of (2.1) with
$F_n$ and $P_n$ given by (2.26) and (2.27) and $K_n = K$, it is worthwhile
to establish sufficient conditions directly on the sequences $\{s_j\}$
and $\{X_j\}$. Such conditions are readily established when $\{s_j\}$ and
$\{X_j\}$ are joint discrete-time Gaussian random processes.


CONDITION A14. The sequences $\{s_j\}$ and $\{X_j\}$ are jointly normally
distributed, and $F_n$ and $P_n$ are given by (2.26) and (2.27) with
$K_n = K$ (a constant).


CONDITION A15. For some $\alpha > 1/2$,
$u^{\alpha} \max_{1 \le i,j \le p} |(R_{xx}(k,k+u))_{i,j}|$
is uniformly bounded for all nonnegative integers k and u.

CONDITION A16. For some $\nu > 0$,
$u^{\nu} \max_{1 \le i \le p} \{ |\rho_s(k,k+u)|, |(P_s(k,k+u))_i| \}$
is uniformly bounded for all nonnegative integers k and u.

THEOREM 7. Suppose that $F_n$ and $P_n$ are given by (2.26) and (2.27)
with $K_n = K$, and that the structure and basic assumptions of
Section II-A are satisfied. If the conditions stated in Lemma 1 are
satisfied, and if Conditions A14, A15, and A16 are satisfied, then
$W_n \xrightarrow{a.s.} w_o$.


D.  Remarks  and  Related  Results 

From a practical viewpoint, Theorem 7 is seemingly of great
significance. In many signal processing applications, the "Gaussian"
assumption is often well-founded, in view of the central limit
theorems. In this case, for the family of algorithms represented by
(2.1) with $F_n$ and $P_n$ given by (2.26) and (2.27) and $K_n = K$ (a
constant), essentially all one needs to verify is that any scalar
covariance function with lag u that one can compute for elements of
$\{X_j\}$ decays more rapidly than $u^{-1/2}$, and that any scalar
covariance function with lag u that one can compute for elements of
$\{s_j\}$ and $\{X_j\}$ decays at least as rapidly as $u^{-\nu}$, for some
$\nu > 0$. It is indeed difficult to imagine any stochastic process
having a bounded spectral density which does not possess this desired
property.

The observant reader will have noticed that the family of algorithms
represented by (2.1) with $F_n$ and $P_n$ given by (2.26) and (2.27) and
$K_n = n$ has not yet been treated. In this case, it is reasonable to
expect that $F_n \xrightarrow{a.s.} R$ and $P_n \xrightarrow{a.s.} P$ as
$n \to \infty$. It is this case to which the following theorem is
addressed. The extremely powerful results of Serfling [6]-[7] can be
applied to establish conditions for which $F_n \xrightarrow{a.s.} R$ and
$P_n \xrightarrow{a.s.} P$. See also the proof of Corollary (4.5) in [1].


THEOREM 8. Suppose that there exist sequences $\{a_n\}_{n=1}^{\infty}$ and
$\{b_n\}_{n=1}^{\infty}$ of nonnegative real numbers (possibly random)
satisfying $\|F_n - R\| \le a_n$ a.s. and $|F_n w_o - P_n| \le b_n$ a.s.
Further, suppose that there exists a positive integer $n_o$ (possibly
random) such that for all $n \ge n_o$,
$0 < \mu_n(\lambda_{\min}(R) - a_n) \le 1$, where $\lambda_{\min}(R)$
denotes the minimum eigenvalue of R. Then for all $n \ge n_o$,

$$ |W_{n+1} - w_o| \le |W_{n_o} - w_o| \prod_{k=n_o}^{n} (1 - \mu_k d_k) + \max_{n_o \le k \le n} (b_k/d_k) \Big( 1 - \prod_{j=n_o}^{n} (1 - \mu_j d_j) \Big), \tag{2.28} $$

where $d_k = \lambda_{\min}(R) - a_k$. Furthermore, if
$\sum_{k=1}^{\infty} \mu_k d_k = \infty$ a.s. and
$b_n d_n^{-1} \xrightarrow{a.s.} 0$, then
$|W_n - w_o| \xrightarrow{a.s.} 0$ as $n \to \infty$.
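The bound of Theorem 8 can be exercised numerically on a small synthetic case. The sketch below is an assumption-laden illustration rather than a reproduction of anything in [1]: the matrix `R`, the perturbation bounds `a_k` and `b_k`, and the gain `mu_k` are all invented for the example, and the gain is additionally chosen so that $\mu_k \lambda_{\max}(F_k) \le 1$, which makes each step a contraction with factor at most $1 - \mu_k d_k$.

```python
import numpy as np

rng = np.random.default_rng(0)
p, steps = 3, 200
R = np.diag([1.0, 2.0, 3.0])           # assumed R, so lambda_min(R) = 1
lam_min = 1.0
w_o = np.array([0.5, -1.0, 2.0])       # assumed limit point

W = np.zeros(p)
err0 = np.linalg.norm(W - w_o)         # |W_{n_o} - w_o| with n_o = 1
prod, worst_ratio, holds = 1.0, 0.0, True
for k in range(1, steps + 1):
    a_k, b_k = 0.5 / k, 1.0 / k        # assumed bounds on ||F_k - R|| and |F_k w_o - P_k|
    mu_k = 1.0 / (k + 3.0)             # keeps mu_k * lambda_max(F_k) <= 1 here
    F = R + np.diag(a_k * rng.uniform(-1.0, 1.0, p))   # symmetric, within a_k of R
    v = rng.standard_normal(p)
    v *= b_k * rng.uniform() / np.linalg.norm(v)       # |v| <= b_k
    P = F @ w_o + v
    d_k = lam_min - a_k
    W = W + mu_k * (P - F @ W)         # recursion (2.1)
    prod *= 1.0 - mu_k * d_k
    worst_ratio = max(worst_ratio, b_k / d_k)
    bound = err0 * prod + worst_ratio * (1.0 - prod)   # right side of (2.28)
    holds = holds and np.linalg.norm(W - w_o) <= bound + 1e-12

final_error = np.linalg.norm(W - w_o)
```

The flag `holds` records whether the error stayed under the right-hand side of (2.28) at every step of this particular run.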


Finally, although the previous results have treated only $R^p$-valued
algorithms, a simple trick can be used to apply the above results to
complex-valued algorithms. Consider

$$ \Gamma_{n+1} = \Gamma_n + \mu_n (G_n - H_n \Gamma_n), \tag{2.29} $$

where $H_n$ is Hermitian nonnegative definite. Using superscripts r
and i to denote real and imaginary parts, respectively, it is easily
shown that

$$ \begin{bmatrix} \Gamma^r_{n+1} \\ \Gamma^i_{n+1} \end{bmatrix}
 = \begin{bmatrix} \Gamma^r_n \\ \Gamma^i_n \end{bmatrix}
 + \mu_n \left( \begin{bmatrix} G^r_n \\ G^i_n \end{bmatrix}
 - \begin{bmatrix} H^r_n & -H^i_n \\ H^i_n & H^r_n \end{bmatrix}
   \begin{bmatrix} \Gamma^r_n \\ \Gamma^i_n \end{bmatrix} \right). \tag{2.30} $$

Consequently, complex-valued algorithms such as (2.29) with $H_n$
Hermitian can be put into the form of (2.1) with $F_n$ real and symmetric
by making use of (2.30). Furthermore, it is easily shown that the
resulting $F_n$ is positive definite if and only if $H_n$ is positive
definite.
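The real embedding of a complex update can be checked directly. In this sketch the random test matrices, the step size `mu`, and the helper name `real_embedding` are assumptions introduced for illustration; the block structure is the standard one for stacking real and imaginary parts.

```python
import numpy as np

def real_embedding(H, G, Gamma):
    """Stack real and imaginary parts into real F, P, W for the form (2.1)."""
    F = np.block([[H.real, -H.imag], [H.imag, H.real]])
    P = np.concatenate([G.real, G.imag])
    W = np.concatenate([Gamma.real, Gamma.imag])
    return F, P, W

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = A.conj().T @ A                     # Hermitian nonnegative definite
G = rng.standard_normal(3) + 1j * rng.standard_normal(3)
Gamma = rng.standard_normal(3) + 1j * rng.standard_normal(3)
mu = 0.05

complex_step = Gamma + mu * (G - H @ Gamma)   # one complex-valued update
F, P, W = real_embedding(H, G, Gamma)
real_step = W + mu * (P - F @ W)              # the same update in real form
min_eig_F = np.linalg.eigvalsh(F)[0]
```

Because the imaginary part of a Hermitian matrix is antisymmetric, the stacked matrix `F` comes out symmetric, which is what the real-valued theory requires.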
III. CHANNEL ADAPTIVE STRUCTURES

In  this  chapter,  two  adaptive  communication  systems  developed  at 
the  Naval  Undersea  Center,  San  Diego,  California,  will  be  presented  and 
analyzed. 

A. Basic Communication Scheme and Notation

Consider  the  communication  scheme  illustrated  in  Figure  3.1. 


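[Block diagram: the information sequence $\{i_k\}$ drives a waveform
generator producing $s(t) = \sum_{k=0}^{\infty} s_{i_k}(t-kT)$; this signal
passes through the channel $h_c(\tau)$ to give $y(t)$, and the noise
$n(t)$ is added to form $r(t)$.]

Figure 3.1. Generation of received process {r(t)}.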


The information sequence $\{i_k\}$ is assumed to be composed of elements
of the M-ary alphabet {1, 2, ..., M}. During the kth baud interval,
$t \in [kT, (k+1)T)$, the waveform generator transmits the signal
$s_{i_k}(t-kT)$. It is assumed that $s_m(u) = 0$ for all $u \notin [0,T)$.
Consequently, the transmitted signal, $s(t)$, can be expressed as

$$ s(t) = \sum_{k=0}^{\infty} s_{i_k}(t-kT) = s_{i_{[t/T]}}((t)_T), \tag{3.1} $$

where [ ] denotes largest integer part and $(t)_T = t$ modulo T. Note
that $t = [t/T]\,T + (t)_T$. The output, $y(t)$, of the linear
time-invariant channel, $h_c(\tau)$, is given by

$$ y(t) = \int_{-\infty}^{\infty} s(t-\tau) h_c(\tau)\, d\tau. $$

The additive noise process, $\{n(t)\}$, is assumed to be zero-mean and
wide-sense stationary, with autocorrelation
$\rho_n(\tau) = E(n(t)n(t+\tau))$. It is also assumed that $\{y(t)\}$ and
$\{n(t)\}$ are independent.

Assuming that $\{s(t)\}$ and $\{h_c(\tau)\}$ are "approximately"
bandlimited to $f \in (-\frac{1}{2D}, \frac{1}{2D})$ and that $h_c(\tau)$
is "approximately" duration limited to $\tau \in [0, (L-1)D)$, i.e., that
$h_c(\tau) = 0$ for all $\tau \notin [0, (L-1)D)$, the continuous-time
model of Figure 3.1 can be approximately represented by the
discrete-time system illustrated in Figure 3.2.


Figure 3.2. Discrete-time model of received process $\{r_\ell\}$.


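In Figure 3.2, $h_j = h_c((j-1)D)$, $j = 1, 2, \ldots, L$. Consequently,
the discrete-time received process, $\{r_\ell\}$, can be expressed as

$$ r_\ell = y_\ell + n_\ell = \sum_{i=1}^{L} h_i s_{\ell-i+1} + n_\ell \tag{3.2} $$

$$ \phantom{r_\ell} = H'S_\ell + n_\ell, \tag{3.3} $$

where $H = (h_1, h_2, \ldots, h_L)'$,
$S_\ell = (s_\ell, s_{\ell-1}, \ldots, s_{\ell-L+1})'$,
and ' denotes matrix transpose. It is convenient to assume that
$ND = T$, where N is a positive integer. Define the received data
vector, $R_\ell$, by

$$ R_\ell = (r_\ell, r_{\ell-1}, \ldots, r_{\ell-N+1})'. \tag{3.6} $$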
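The discrete-time model amounts to a finite convolution plus noise, followed by a sliding window over the received samples. The sketch below uses made-up channel taps and transmitted samples, takes $s_\ell = 0$ before the start of the record, and uses 0-based indices; all names are hypothetical.

```python
import numpy as np

def received_sequence(s, h, noise):
    """r_l = sum_i h_i s_{l-i+1} + n_l, with s_l = 0 before the record starts."""
    y = np.convolve(s, h)[: len(s)]
    return y + noise

def data_vector(r, l, N):
    """R_l = (r_l, r_{l-1}, ..., r_{l-N+1})' for l >= N-1 (0-based index)."""
    return r[l - N + 1 : l + 1][::-1]

h = np.array([1.0, 0.5, -0.25])                  # hypothetical taps h_1..h_L
s = np.array([1.0, -1.0, 1.0, 1.0, -1.0, 1.0])   # hypothetical transmitted samples
r = received_sequence(s, h, noise=np.zeros(len(s)))
R5 = data_vector(r, 5, N=3)
S5 = s[3:6][::-1]                                # S_l = (s_5, s_4, s_3)'
```

With the noise set to zero, the inner product `h @ S5` reproduces `r[5]`, which is the vector form $H'S_\ell$ of the same sample.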

The conventional correlation receiver for estimating the information
sequence $\{i_k\}$ is illustrated in Figure 3.3.


Figure 3.3. Conventional correlation receiver.

The set of statistics $\{\gamma_m(\ell) : 1 \le m \le M\}$ is determined
from $\gamma_m(\ell) = R_\ell' S_{m,ref}(\ell)$, where $R_\ell$ is given by
(3.6) and $S_{m,ref}(\ell)$ is an $N \times 1$ reference vector,
$m \in \{1, 2, \ldots, M\}$. The set of reference vectors is a function
of the channel, $h_c(\cdot)$, the basic signal set,
$\{s_m(\cdot) : m = 1, 2, \ldots, M\}$, and the noise autocorrelation
function, $\rho_n(\cdot)$.
B. Direct Model

The direct model incorporates an adaptive receiver structure
that establishes a transversal filter model for the communication
channel. The basic idea is to apply a periodic reference signal,
$s_{ref}((\ell)_N)$, composed of elements of the basic signal set, as the
input of an adaptive transversal filter having N taps and tap
spacing D. The weight vector of the transversal filter is then trained
to minimize the average mean-square error between the output of the
transversal filter, $z_\ell$, and the received data, $r_\ell$. After some
necessary notation is established, several highly desirable properties
of this scheme are shown.

1. Notation and Basic Properties

Define the $N \times 1$ vectors

$$ S_{ref}(\ell) = (s_{ref}((\ell)_N), s_{ref}((\ell-1)_N), \ldots, s_{ref}((\ell-N+1)_N))' \tag{3.7} $$

and

$$ W = (w_1, w_2, \ldots, w_N)'. \tag{3.8} $$

The output of the transversal filter, $z_\ell$, can thus be expressed as

$$ z_\ell = W'S_{ref}(\ell). \tag{3.9} $$

It is desired to find the W which minimizes

$$ \xi(W) = \sum_{\ell} E((r_\ell - z_\ell)^2), \tag{3.10} $$

where the summation is over any N adjacent integer values of $\ell$.
Assuming that $\{i_k\}$ and $\{n_k\}$ are jointly wide-sense stationary,
$E(y_\ell)$ is periodic (in $\ell$) with period N. From (3.9), (3.10),
and (3.3), we have

$$ \xi(W) = \sum_{\ell} \big[ E(r_\ell^2) - 2W'S_{ref}(\ell)E(y_\ell) + W'S_{ref}(\ell)S'_{ref}(\ell)W \big]. \tag{3.11} $$
The desired solution, $w_o$, is given as the solution to the set of
linear equations

$$ \sum_{\ell} S_{ref}(\ell) S'_{ref}(\ell) W = \sum_{\ell} S_{ref}(\ell) E(y_\ell). \tag{3.12} $$

Assuming that $\sum_{\ell} S_{ref}(\ell) S'_{ref}(\ell)$ is positive
definite and that $\sum_{\ell} S_{ref}(\ell)E(y_\ell) \ne 0$, then a
unique, nontrivial solution to (3.12) exists.

Assume that $\Pr\{i_k = m\} = p_m$ for $m = 1, 2, \ldots, M$. Define

$$ S_m(\ell) = (s_m((\ell)_N D), s_m((\ell-1)_N D), \ldots, s_m((\ell-N+1)_N D))' \tag{3.13} $$

for $m = 1, 2, \ldots, M$. Then

$$ E(s_\ell) = \sum_{m=1}^{M} E(s_\ell \mid i_k = m)\, p_m, \qquad kN \le \ell < (k+1)N, $$

and hence

$$ E(S_{i_k}(\ell)) = \sum_{m=1}^{M} p_m S_m(\ell). \tag{3.14} $$

Let $\kappa$ be a positive integer such that
$(\kappa-1)N < L \le \kappa N$ and define

$$ H_i = (h_{iN+1}, h_{iN+2}, \ldots, h_{(i+1)N})', \qquad i = 0, 1, \ldots, \kappa-1. \tag{3.15} $$

Recall that $h_\ell = 0$ for all $\ell > L$. From (3.3), (3.14), and
(3.15) we have that

$$ E(y_\ell) = H'E(S_\ell) = \sum_{i=0}^{\kappa-1} H_i' \sum_{m=1}^{M} p_m S_m(\ell). \tag{3.16} $$

Define a forward cyclical shift operator $C_N$ such that for all
$S = (s_1, s_2, \ldots, s_N)'$, $C_N S = (s_N, s_1, \ldots, s_{N-1})'$.
It is easily seen that the $N \times N$ matrix

$$ C_N = \begin{bmatrix}
0 & 0 & \cdots & 0 & 1 \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix} \tag{3.17} $$

satisfies the desired property. Suppose now that

$$ S_{ref}(\ell) = \sum_{m=1}^{M} p_m S_m(\ell + n_d), \tag{3.18} $$

i.e., that the reference vector is $E(S_{i_k}(\ell + n_d))$. Here $n_d$
is a fixed but unknown shift between the transmitter and receiver
clocks. From (3.17) and (3.18) we have

$$ S_{ref}(\ell) = C_N^{n_d} \sum_{m=1}^{M} p_m S_m(\ell). \tag{3.19} $$

Consequently, (3.12) can be expressed as

$$ C_N^{n_d} \sum_{\ell} \sum_{m=1}^{M} \sum_{m_1=1}^{M} p_m p_{m_1} S_m(\ell) S'_{m_1}(\ell) (C_N^{n_d})' W
 = C_N^{n_d} \sum_{\ell} \sum_{m=1}^{M} p_m S_m(\ell) \sum_{m_1=1}^{M} p_{m_1} S'_{m_1}(\ell) \sum_{i=0}^{\kappa-1} H_i. \tag{3.20} $$

Under certain obvious conditions on the $S_m(\ell)$, equation (3.20)
provides the important result that

$$ (C_N^{n_d})' W = \sum_{i=0}^{\kappa-1} H_i. \tag{3.21} $$

Result (3.21) shows that the solution to the minimization of (3.10)
is a cyclical shift of $\sum_{i=0}^{\kappa-1} H_i$. If
$\sum_{i=0}^{\kappa-1} H_i = H_0$ and an adaptive algorithm can be
employed to converge to the solution of (3.21) without prior
knowledge of $n_d$, then a self-synchronizing adaptive receiver
structure will result. Note that the solution to (3.12) expressed by
(3.21) suggests the presence of intersymbol interference in the
resultant detector structure. It seems feasible to incorporate a
baud-rate adaptive transversal filter to help alleviate the
performance degradation due to this intersymbol interference.
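The cyclic-shift property of the direct-model solution can be verified on a toy case by solving the normal equations (3.12) numerically. Everything concrete below (the two sampled waveforms, the probabilities, the channel taps, the shift `n_d`, and the function names) is hypothetical and chosen only so that the matrix in (3.12) is nonsingular.

```python
import numpy as np

def shift_matrix(N):
    """Forward cyclic shift: C @ (s_1, ..., s_N) = (s_N, s_1, ..., s_{N-1})."""
    return np.roll(np.eye(N), 1, axis=0)

def mean_vector(mean_wave, l):
    """sum_m p_m S_m(l): entries mean_wave[(l - j) mod N], j = 0..N-1."""
    N = len(mean_wave)
    return mean_wave[(l - np.arange(N)) % N]

N, n_d = 4, 3
waves = np.array([[1.0, 2.0, 0.0, -1.0],      # hypothetical samples of s_1
                  [0.5, -1.0, 3.0, 1.0]])     # hypothetical samples of s_2
p = np.array([0.3, 0.7])                      # symbol probabilities p_m
mean_wave = p @ waves

h = np.array([1.0, 0.6, -0.3, 0.2, 0.1, -0.05])   # hypothetical taps, L = 6
kappa = -(-len(h) // N)                            # smallest kappa with L <= kappa*N
H_sum = np.pad(h, (0, kappa * N - len(h))).reshape(kappa, N).sum(axis=0)

C = shift_matrix(N)
A = np.zeros((N, N))
b = np.zeros(N)
for l in range(N):                                 # any N adjacent values of l
    S_ref = mean_vector(mean_wave, l + n_d)        # reference vector, (3.18)
    Ey = H_sum @ mean_vector(mean_wave, l)         # E(y_l), (3.16)
    A += np.outer(S_ref, S_ref)
    b += S_ref * Ey
W = np.linalg.solve(A, b)                          # solution of (3.12)
unshifted = np.linalg.matrix_power(C, n_d).T @ W   # left side of (3.21)
```

Undoing the shift recovers the sum of the baud-length channel segments regardless of the value chosen for `n_d`, which is the self-synchronization property described in the text.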

2.  Adaptive  Direct  Model 


Consider a recursive algorithm of the form

$$ W_{n+1} = W_n - \mu_n (F_n W_n - P_n), \tag{3.22} $$

for $n = 1, 2, \ldots$, and $W_1$ arbitrary. Algorithm (3.22) is identical
with (2.1) and hence, convergence results presented in Chapter II
are ideally suited to the adaptive direct model processor. In
particular, note that algorithms of the form of (3.22) with

$$ F_n = K_n^{-1} \sum_{j=n-K_n+1}^{n} S_{ref}(j) S'_{ref}(j), \tag{3.23} $$

$S_{ref}(j)$ given by (3.19), and

$$ P_n = K_n^{-1} \sum_{j=n-K_n+1}^{n} r_j S_{ref}(j) \tag{3.24} $$

satisfy the specialized framework established by (2.1), (2.26), and
(2.27). Of course, the assumption that $\{s_j\}$ and $\{X_j\}$ are zero
mean, made in Section II-C, is violated. Furthermore, note that $F_n$
given by (3.23) is deterministic, hence Theorem 3 is applicable.

The convergence results of Chapter II are now applied to treat
the specialized family of algorithms represented by (3.22), (3.23),
and (3.24). The assumptions and structure of the preceding subsection
are assumed to hold throughout this subsection. In particular, it is
assumed that $w_o$ is the unique solution to (3.12) and that
$S_{ref}(\ell)$ is given by (3.19). In other words, it is assumed that
$w_o = R^{-1}P$, where

$$ R = N^{-1} \sum_{j=1}^{N} S_{ref}(j) S'_{ref}(j), \tag{3.25} $$

$$ P = N^{-1} \sum_{j=1}^{N} S_{ref}(j) E(y_j), \tag{3.26} $$

R is positive definite, and $P \ne 0$.

CONDITION B1. The entire sequence $\{\mu_k\}$ satisfies either
$\mu_k = a([\,\cdot\,] + b)^{-1}$ or $\mu_k = a(k+b)^{-1}$, where $a > 0$,
$b > 0$.

CONDITION B2. For some $\nu > 0$,
$u^{\nu} |E(r_k r_{k+u}) - E(r_k)E(r_{k+u})|$
is uniformly bounded for all nonnegative integers k and u.

CONDITION B3. The sequence $\{\mu_k\}$ satisfies
$0 < \mu_{k+1} \le \mu_k$ and $0 < \lim_{k\to\infty} k\mu_k < \infty$.

THEOREM 9. If $K_n = K$ (a constant) in (3.23) and (3.24), and if
Conditions B1 and B2 are satisfied, then $W_n \xrightarrow{a.s.} w_o$ as
$n \to \infty$.

THEOREM 10. If $K_n = N$ in (3.23) and (3.24), and if Conditions B2
and B3 are satisfied, then $W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.


With these results established, it is worthwhile to note that
Theorem 9 above establishes the desired convergence result for
algorithms of the form of (3.22) that have the least demanding
computational requirements, i.e., when $K_n = 1$. In this case, (3.22)
can be expressed as

$$ W_{n+1} = W_n - \mu_n S_{ref}(n)(z_n - r_n), \tag{3.32} $$

where $z_n = S'_{ref}(n)W_n$ is the output of a transversal filter having
the "weight vector" $W_n$ and input $s_{ref}((n)_N)$. Even though
Theorem 9 provides the desired convergence result for (3.32), other
techniques for solving (3.12) should be considered, especially when
convergence rate is an overriding issue.
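A single-sample update of this kind is straightforward to simulate. The sketch below makes several simplifying assumptions that are not in the report: the received data are modelled as $r_n = S'_{ref}(n)w_o$ plus white Gaussian noise, the reference period and the target `w_o` are invented, and the gain is taken as $\mu_n = a/(n+b)$ with arbitrary `a` and `b`.

```python
import numpy as np

def ref_vector(s_ref_period, n):
    """S_ref(n): the N most recent samples of the periodic reference signal."""
    N = len(s_ref_period)
    return s_ref_period[(n - np.arange(N)) % N]

def adapt_direct_model(r, s_ref_period, a=2.0, b=10.0):
    """Iterate W_{n+1} = W_n - mu_n S_ref(n) (z_n - r_n), z_n = S_ref(n)' W_n."""
    W = np.zeros(len(s_ref_period))
    for n in range(len(r)):
        S = ref_vector(s_ref_period, n)
        z = S @ W                       # transversal filter output
        mu = a / (n + 1 + b)            # decreasing gain sequence
        W = W - mu * S * (z - r[n])
    return W

rng = np.random.default_rng(2)
s_ref_period = np.array([0.65, -0.1, 2.1, 0.4])   # hypothetical reference period
w_o = np.array([1.1, 0.55, -0.3, 0.2])            # hypothetical target weights
r = np.array([ref_vector(s_ref_period, n) @ w_o for n in range(5000)])
r = r + 0.1 * rng.standard_normal(5000)
W_hat = adapt_direct_model(r, s_ref_period)
```

Each iteration costs only one inner product and one scaled vector addition, which reflects the low computational demand noted in the text.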

3.  Alternative  Procedures  for  the  Solution  of  (3.12) 


Note  that  R given  by  (3.25)  can  be  computed  a priori  and  that 


L M. 
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since  S ^(j)  periodic  with  period  N,  R is  a real  symmetric  Toeplitz 
matrix.  Since  efficient  computational  procedures  are  available  for 
solving  a Toeplitz  system  of  equations  (see  e.g.  [5]),  one  is  lead  to 
consider  the  following  procedure: 


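$$\hat{P}_{n+1} = \hat{P}_n - \frac{1}{n+1}\left(\hat{P}_n - r_{n+1} S_{\mathrm{ref}}(n+1)\right), \qquad (3.33)$$

for $n = 1, 2, \ldots$, and $\hat{P}_1 = r_1 S_{\mathrm{ref}}(1)$. Algorithm (3.33) is simply a recursive formulation of the usual sample mean estimator for $P$, i.e.,

$$\hat{P}_n = \frac{1}{n}\sum_{j=1}^{n} r_j S_{\mathrm{ref}}(j). \qquad (3.34)$$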


Algorithm (3.33) also fits the framework of (2.1) by letting $F_n = I$, $W_n = \hat{P}_n$, and $P_n = r_{n+1} S_{\mathrm{ref}}(n+1)$. In fact, $\hat{P}_n \xrightarrow{a.s.} P$ as $n \to \infty$ provided that Condition B2 above is satisfied. It is the author's conjecture that the rate of convergence of (3.33) is far more rapid than that of the previously considered algorithms. An estimate of $w_o$ can be computed by solving

$$R\,W_n = \hat{P}_n$$

for $W_n$ using techniques such as the Levinson algorithm or the Trench algorithm [5], [8].
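The two-stage procedure above (a running sample mean for $P$, followed by a Toeplitz solve) can be sketched as follows. This is an illustrative sketch only: `levinson_solve` is a textbook Levinson-type recursion for a symmetric Toeplitz system, not the specific procedures of [5] or [8], and the reference sequence, channel `h`, noise level, and the first column `gamma` of $R$ are hypothetical values. The sketch also assumes that all leading principal submatrices of the Toeplitz matrix are nonsingular, which holds when $R$ is positive definite.

```python
import random

def update_mean(p_hat, n, r_next, s_next):
    """One step of the sample-mean recursion: P_{n+1} = P_n - (P_n - r_{n+1} S_ref(n+1)) / (n + 1)."""
    return [p - (p - r_next * s) / (n + 1) for p, s in zip(p_hat, s_next)]

def levinson_solve(t, b):
    """Solve T x = b, where T is symmetric Toeplitz with first column t."""
    f = [1.0 / t[0]]                 # forward vector: T_k f = e_1
    x = [b[0] / t[0]]
    for k in range(1, len(t)):
        ef = sum(t[k - i] * f[i] for i in range(k))   # mismatch of [f, 0] in the new last row
        fe = f + [0.0]
        be = [0.0] + f[::-1]                          # backward vector (reversed f, by symmetry)
        f = [(u - ef * v) / (1.0 - ef * ef) for u, v in zip(fe, be)]
        ex = sum(t[k - i] * x[i] for i in range(k))   # mismatch of [x, 0] in the new last row
        back = f[::-1]                                # solves T_{k+1} back = e_{k+1}
        x = [u + (b[k] - ex) * v for u, v in zip(x + [0.0], back)]
    return x

# Synthetic demonstration (illustrative assumptions throughout).
random.seed(1)
s_ref = [1, 1, 1, -1, 1, -1, -1]
h = [1.0, 0.5, -0.2]

def s_vec(n):
    return [s_ref[(n - i) % 7] for i in range(3)]

def received(n):
    return sum(hi * si for hi, si in zip(h, s_vec(n))) + random.gauss(0.0, 0.05)

p_hat = [received(1) * s for s in s_vec(1)]
for n in range(1, 700):
    p_hat = update_mean(p_hat, n, received(n + 1), s_vec(n + 1))

gamma = [1.0, -1.0 / 7.0, -1.0 / 7.0]   # first column of R for this particular reference
w_hat = levinson_solve(gamma, p_hat)
print([round(v, 3) for v in w_hat])
```

The recursion needs on the order of $n^2$ operations for an $n \times n$ system, compared with $n^3$ for general elimination, and it uses only the first column of $R$, consistent with the storage remark for symmetric Toeplitz matrices.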

A discussion of hardware implementations of such a procedure is given in [8]. Another technique is to use the Trench algorithm to compute $R^{-1}$ and then estimate $w_o$ by $W_n = R^{-1}\hat{P}_n$. A disadvantage to this approach is that $R^{-1}$ may contain as many as $N^2/4$ distinct elements. Still another technique is to compute $R$ a priori and use (3.22) with $F_n = R$ and $P_n = \hat{P}_n$ computed by (3.33). This final scheme makes efficient use of available a priori information and has only a moderate increase in computational requirements over (3.32). Note also that since $R$ is an $N \times N$ symmetric Toeplitz matrix, only $N$ elements of $R$ need to be stored.

C. Inverse Model

The inverse model incorporates an adaptive transversal filter, the input of which is the discrete-time received sequence $\{r_\ell\}$. The "weight vector," $W$, of the transversal filter is trained to minimize the average mean-square error between the output of the transversal filter, $z_\ell$, and a periodic reference signal, $s_{\mathrm{ref}}((\ell)_N)$, composed of elements of the basic signal set. The name "inverse model" suggests that the adaptive transversal filter attempts to undo the distortion caused by the channel.

1.  Notation  and  Basic  Properties 

Define the received "data vector" by

$$R_\ell = (r_\ell, r_{\ell-1}, \ldots, r_{\ell-N_1+1})', \qquad (3.35)$$

and the transversal filter weight vector, $W$, by

$$W = (w_1, w_2, \ldots, w_{N_1})', \qquad (3.36)$$

so that the output of the transversal filter at time $\ell$ is

$$z_\ell = R_\ell' W. \qquad (3.37)$$

It is desired to choose $W$ to minimize

$$\xi(W) = \sum_{\ell} E\big((z_\ell - s_{\mathrm{ref}}((\ell)_N))^2\big), \qquad (3.38)$$

where the summation is over any $N$ adjacent integer values of $\ell$.

Assuming that $\{i_k\}$ and $\{n_k\}$ are jointly wide-sense stationary, $E(R_\ell)$, $E(R_\ell R_\ell')$, $E(z_\ell)$, and $E(z_\ell^2)$ are all periodic with period $N$. Defining

$$R = N^{-1}\sum_{\ell=1}^{N} E(R_\ell R_\ell') \qquad (3.39)$$

and

$$P = N^{-1}\sum_{\ell=1}^{N} s_{\mathrm{ref}}((\ell)_N)\,E(R_\ell), \qquad (3.40)$$

and assuming that $R$ is positive definite and $P \neq 0$, the unique minimum of $\xi(W)$ is achieved when $W = w_o$, where

$$w_o = R^{-1}P. \qquad (3.41)$$

It is assumed here that $s_{\mathrm{ref}}((\ell)_N)$ is given by

$$s_{\mathrm{ref}}((\ell)_N) = \sum_{m=1}^{M} E(s_\ell \mid i_k = m)\,p_m, \quad kN \le \ell < (k+1)N. \qquad (3.42)$$

The solution to the direct model, given by (3.21), is self-synchronizing, and the relationship between the solution, $w_o$, and the channel unit pulse response is readily established. It is shown below that if $s_{\mathrm{ref}}((\ell - n_c)_N)$ is used in place of $s_{\mathrm{ref}}((\ell)_N)$ in (3.38) and (3.40), then the resulting solution, $w_o^*$, is not a simple cyclical shift of $w_o$ given by (3.41). Results concerning the relationship between $w_o$ and the channel unit pulse response are not readily obtained nor interpreted and hence will not be included here.

Consider

$$P^* = N^{-1}\sum_{\ell=1}^{N} s_{\mathrm{ref}}((\ell - n_c)_N)\,E(R_\ell) = N^{-1}\sum_{\ell=1}^{N} s_{\mathrm{ref}}((\ell)_N)\,E(R_{\ell+n_c}), \qquad (3.43)$$

where the last equality follows from the periodicity of $E(R_\ell)$ and a change of summation index. From (3.43), (3.17), and (3.40), it is readily seen that

$$P^* = C_{N_1}^{n_c} P. \qquad (3.44)$$

Consequently, the unique minimum of

$$\xi^*(W) = \sum_{\ell} E\big((z_\ell - s_{\mathrm{ref}}((\ell - n_c)_N))^2\big) \qquad (3.45)$$

is achieved when

$$W = w_o^* = R^{-1}P^* = R^{-1} C_{N_1}^{n_c} P = R^{-1} C_{N_1}^{n_c} R\,w_o. \qquad (3.46)$$

Denote the $ij$th element of the real symmetric Toeplitz matrix $R$ by $\gamma_{|i-j|}$, the $i$th element of $P$ by $(P)_i$, the $i$th element of $w_o$ by $w_i$, and the $i$th element of $w_o^*$ by $w_i^*$. Equation (3.46) implies that $R\,w_o^* = C_{N_1}^{n_c} P$ and $R\,w_o = P$. Consequently,

$$\sum_{j=1}^{N_1} \gamma_{|i-j|}\,w_j^* = \big(C_{N_1}^{n_c} P\big)_i \qquad (3.47)$$

for $i = 1, 2, \ldots, N_1$. For $n_c = 1$ and $i = 2$, equation (3.47) implies that

$$\gamma_1 w_1^* + \gamma_0 w_2^* + \cdots + \gamma_{N_1-2}\,w_{N_1}^* = \gamma_0 w_1 + \gamma_1 w_2 + \cdots + \gamma_{N_1-1}\,w_{N_1}. \qquad (3.48)$$

Note that the left-hand side of (3.48) involves only $\gamma_0, \gamma_1, \ldots, \gamma_{N_1-2}$, while the right-hand side of (3.48) also involves $\gamma_{N_1-1}$. Consequently, $w_o^*$ cannot, in general, be a simple cyclical shift of $w_o$. However, a reasonable conjecture is that the resulting detector performance using $w_o^*$ will not differ markedly from that using $w_o$ provided that $N_1$ is large enough. Unfortunately, analytical justification for this conjecture is not available. Stochastic approximation algorithms for the solution of (3.41) are dealt with below.

2. Adaptive Inverse Model
Consider the algorithm

$$W_{n+1} = W_n - \mu_n (F_n W_n - P_n), \qquad (3.49)$$

for $n = 1, 2, \ldots$, and $W_1$ arbitrary. Suitable choices for $F_n$ and $P_n$ are given by

$$F_n = K_n^{-1}\sum_{j=n-K_n+1}^{n} R_j R_j' \qquad (3.50)$$

and

$$P_n = K_n^{-1}\sum_{j=n-K_n+1}^{n} s_{\mathrm{ref}}((j)_N)\,R_j. \qquad (3.51)$$

The family of algorithms represented by (3.49)-(3.51), of course, satisfies the general framework of Chapter II.
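A windowed update of this family with constant $K_n = K$ can be sketched as follows. The sketch never forms $F_n$ explicitly; it uses the identity $F_n W_n - P_n = K^{-1}\sum_j R_j\,(R_j' W_n - s_{\mathrm{ref}}((j)_N))$. The channel `h`, the reference sequence, the noise level, and the gain $2/(n+20)$ are hypothetical choices made only for illustration.

```python
import random

def inverse_model_step(w, window, mu):
    """One step W <- W - mu (F W - P), with F and P averaged over `window`.

    `window` is a list of (R_j, s_j) pairs: data vector and reference sample.
    """
    k = len(window)
    grad = [0.0] * len(w)
    for r_vec, s in window:
        err = sum(wi * ri for wi, ri in zip(w, r_vec)) - s   # z_j - s_ref((j)_N)
        grad = [g + ri * err / k for g, ri in zip(grad, r_vec)]
    return [wi - mu * g for wi, g in zip(w, grad)]

# Synthetic demonstration (illustrative assumptions throughout).
random.seed(2)
N, N1, K = 7, 5, 4
s_ref = [1, 1, 1, -1, 1, -1, -1]
h = [1.0, 0.4]                                   # hypothetical channel
r = [sum(h[i] * s_ref[(n - i) % N] for i in range(len(h))) + random.gauss(0.0, 0.05)
     for n in range(5000)]

def data_vec(j):
    return [r[j - i] for i in range(N1)]         # (r_j, r_{j-1}, ..., r_{j-N1+1})

def mse(w, start, stop):
    errs = [sum(wi * ri for wi, ri in zip(w, data_vec(j))) - s_ref[j % N]
            for j in range(start, stop)]
    return sum(e * e for e in errs) / len(errs)

w = [0.0] * N1
for n in range(N1 + K, len(r)):
    window = [(data_vec(j), s_ref[j % N]) for j in range(n - K + 1, n + 1)]
    w = inverse_model_step(w, window, mu=2.0 / (n + 20.0))
print(round(mse([0.0] * N1, 4000, 5000), 3), round(mse(w, 4000, 5000), 3))
```

The two printed numbers are the mean-square error between filter output and reference before and after training; a large reduction indicates that the transversal filter has approximately inverted the assumed channel.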

Convergence results of Chapter II are now applied to the family of algorithms represented by (3.49) with $F_n$ and $P_n$ given by (3.50) and (3.51). It is assumed that $w_o = R^{-1}P$, where $R$ is given by (3.39) and $P$ is given by (3.40). In case $s_{\mathrm{ref}}((j - n_c)_N)$ is used in place of $s_{\mathrm{ref}}((j)_N)$ in (3.51), all statements concerning the a.s. convergence of $W_n$ to $w_o$ must be replaced by equivalent statements concerning the a.s. convergence of $W_n$ to $w_o^*$, where $w_o^* = R^{-1}P^* = R^{-1} C_{N_1}^{n_c} P$. Furthermore, it is assumed that $K_n = K$ (a constant) in (3.50) and (3.51). The case when $K_n = n$ is more readily handled by Theorem 8.

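CONDITION C1. Define $f_i = r_i^2 - E(r_i^2)$, and

$$\gamma_f(\ell_1, \ell_2, \ell_3, \ell_4) = \max_{\substack{\ell_m - N_1 + 1 \le q_m \le \ell_m \\ m \in \{1,2,3,4\}}} |E(f_{q_1} f_{q_2} f_{q_3} f_{q_4})|.$$

Define $p_k = [k^{\alpha}]$, $\nu_1 = 1$, $\nu_{k+1} = \nu_k + p_k$, and $J_k = \{\nu_k, \nu_k + 1, \ldots, \nu_{k+1} - 1\}$ for $k = 1, 2, \ldots$, with $0 \le \alpha < 1$. Condition C1 is that

$$\sum_{k=1}^{\infty} p_k^{-4} \sum_{i,j,\ell,m \in J_k} \gamma_f(i,j,\ell,m) < \infty \qquad (3.52)$$

for some $\alpha$, $0 < \alpha < 1$.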


Note that Condition A1 is satisfied if Condition C1 is satisfied and $K_n = K$ (a constant). Since $r_i = y_i + n_i$, it is of interest to establish conditions on the random sequences $\{y_i\}$ and $\{n_i\}$ which ensure the a.s. convergence of $W_n$ to $w_o$. Recall that $\{y_i\}$ and $\{n_i\}$ are assumed to be independent and that $E(n_i) = 0$.

LEMMA 2. Define $g_i = y_i^2 - E(y_i^2)$ and $h_i = n_i^2 - E(n_i^2)$. If $E(n_{i_1} n_{i_2} \cdots n_{i_{2\ell-1}}) = 0$ for $\ell = 1, 2, \ldots, 4$, then Condition C1 will be satisfied if

$$\sum_{k=1}^{\infty} p_k^{-4} \sum_{i,j,\ell,m \in J_k} \gamma_{gh}(i,j,\ell,m) < \infty, \qquad (3.53)$$

where

$$\begin{aligned}
\gamma_{gh}(i,j,\ell,m) ={}& |E(g_i g_j g_\ell g_m)| + |E(h_i h_j h_\ell h_m)| \\
&+ |E(g_i g_j)|\,|E(h_\ell h_m)| + |E(g_i g_j y_\ell y_m)|\,|E(n_\ell n_m)| \\
&+ |E(g_i y_\ell y_m)|\,|E(h_j n_\ell n_m)| + |E(h_i h_j n_\ell n_m)|\,|E(y_\ell y_m)| \\
&+ |E(y_i y_j y_\ell y_m)|\,|E(n_i n_j n_\ell n_m)|, \qquad (3.54)
\end{aligned}$$

and $p_k$, $J_k$ are defined as in Condition C1.

Proof. The proof follows easily by expanding $E(f_i f_j f_\ell f_m)$ and collecting similar terms by subscript interchanges.

LEMMA 3. Let $\{g_i\}$, $\{h_i\}$, $\{n_i\}$, $p_k$, $\nu_k$, and $J_k$ be as in Lemma 2. Suppose $\{y_i\}$ is $M$-dependent; then Condition C1 will be satisfied if


vr1 


(3.55) 


where 


Ya,b=.fa  £fa  mfa  ( I E(hihjhjj,hm)  I + I ® (h^h. n^n^)  1^1  E(ninjn£.nm)  ^ } 


b b b b b 

+ S £ £ |E (h  n n . ) | +(b-3)  E Z ( | E(h.h. ) | + | E(n.n . ) | ) 

i=a  1= a j=a  J 16  1 i=a  j=a  1 J 1 J 


+ Z Z |E(hnJ)|.  (3.56) 

i^a  j=a  J 


Proof. It suffices to consider $\{y_i\}$ to be an independent, identically distributed sequence. Applying this fact to (3.54), the desired result follows.


Making use of Lemma 3 and techniques given in [1], Lemma 4 below can be proven.

LEMMA 4. If $\{y_k\}$ is $M$-dependent, if $\{n_k\}$ is jointly normally distributed, and if $u^{\beta}\,|E(n_k n_{k+u})|$ is uniformly bounded for all nonnegative integers $k$ and $u$ and some $\beta > 1/2$, then Condition C1 is satisfied.

Next consider Condition A3. With $g_i$ and $h_i$ as in Lemma 2,

$$r_i^2 - E(r_i^2) = g_i + h_i + 2 y_i n_i \qquad (3.57)$$

and hence

$$E\big((r_i^2 - E(r_i^2))(r_j^2 - E(r_j^2))\big) = E(g_i g_j) + E(h_i h_j) + 4E(y_i y_j)E(n_i n_j). \qquad (3.58)$$

Making use of (3.58) and similar arguments for pp, ppF, ppFp, pFF, and ppFF, the conditions stated in Lemma 4 can be shown to be sufficient for Condition A3. Theorems 11 and 12 can thus be readily established from these comments and Theorem 1.

THEOREM 11. If the conditions of Lemma 4 are satisfied, if Condition B1 is satisfied, and if $K_n = K$ (a constant), then $W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.

THEOREM 12. If the conditions of Lemma 4 are satisfied, if Condition B3 is satisfied, and if $K_n = N$, then $W_n \xrightarrow{a.s.} w_o$ as $n \to \infty$.

The condition that $\{y_k\}$ is $M$-dependent may or may not be a valid assumption in practice. If the information sequence $\{i_k\}$ is $M$-dependent, and the channel unit pulse response is duration limited, then $\{y_k\}$ is $M$-dependent. Theorem 1, of course, can also be applied to cases when $\{y_k\}$ is not $M$-dependent and $\{n_k\}$ is not a Gaussian random process.

IV. MAXIMUM-LIKELIHOOD SEQUENCE ESTIMATION FOR COMMUNICATION SYSTEMS
USING NONLINEAR MODULATION SCHEMES

In recent years a large amount of literature has appeared treating nonlinear receivers for the detection of digital data transmitted over a dispersive linear channel. Theoretically, the most exciting development in this area is the application of the Viterbi algorithm to the solution of the problem of maximum-likelihood sequence estimation [9], [10]. The interested reader is referred to the excellent review paper by Lucky [11] for an overview of developments in this area. In this chapter, results are obtained for the maximum-likelihood sequence estimation of $\{i_k\}$ for the transmission model of Section III-A. The results obtained here are extensions of the work of Ungerboeck [10] to nonlinear modulation schemes.

A.  Maximum-Likelihood  Sequence  Estimation 

Recall that for the continuous time model of Section III-A, the received process $\{r(t)\}$ is given by $r(t) = y(t) + n(t)$. Assume that $\{n(t): -\infty < t < \infty\}$ is a zero-mean stationary Gaussian random process with $E(n(t)n(s)) = \rho_n(t-s)$. Assume that on the basis of the received data $\{r(t): 0 \le t \le \ell T\}$ it is desired to find a maximum-likelihood estimate of $\{i_k\}_{k=1}^{\ell}$. Assume further that the channel, $h_c(t)$, as well as the noise covariance, $\rho_n(t)$, are known. Furthermore, assume that an inverse operator, $\rho_n^{-1}$, exists such that

$$\int_0^{\ell T} \rho_n^{-1}(t,s)\,\rho_n(t-s)\,ds = \delta(t), \qquad (4.1)$$

where $\delta(t)$ is the Dirac delta function. Note that, in general, $\rho_n^{-1}$ depends on $\ell T$, and that $\rho_n^{-1}$ cannot be interpreted as a linear time-invariant system, even when $\{n(t)\}$ is stationary. The log-likelihood function, $\Lambda(r,\ell T)$, for $\{r(t): 0 \le t \le \ell T\}$ given $\{y(t): 0 \le t \le \ell T\}$ can be expressed as (within an additive constant and a positive multiplicative constant)

$$\begin{aligned}
\Lambda(r,\ell T) ={}& -\frac{1}{2}\int_0^{\ell T}\!\!\int_0^{\ell T} r(t)\,\rho_n^{-1}(t,s)\,r(s)\,dt\,ds + \int_0^{\ell T}\!\!\int_0^{\ell T} r(t)\,\rho_n^{-1}(t,s)\,y(s)\,dt\,ds \\
&- \frac{1}{2}\int_0^{\ell T}\!\!\int_0^{\ell T} y(t)\,\rho_n^{-1}(t,s)\,y(s)\,dt\,ds. \qquad (4.2)
\end{aligned}$$

The sequence $\{\hat{i}_k\}_{k=1}^{\ell}$ is considered to be the maximum-likelihood estimate of $\{i_k\}_{k=1}^{\ell}$ if $\{\hat{y}(t): 0 \le t \le \ell T\}$ is the signal component of the received process corresponding to the information sequence $\{\hat{i}_k\}_{k=1}^{\ell}$, and this $\{\hat{i}_k\}_{k=1}^{\ell}$ results in the maximum of $\Lambda(r,\ell T)$ over all allowable sequences $\{i_k\}_{k=1}^{\ell}$. Since the maximization of $\Lambda(r,\ell T)$ is unaffected by the first term on the right-hand side of (4.2), it is sufficient to maximize

$$\Lambda^*(r,\ell T) = \int_0^{\ell T}\!\!\int_0^{\ell T} r(t)\,\rho_n^{-1}(t,s)\,y(s)\,dt\,ds - \frac{1}{2}\int_0^{\ell T}\!\!\int_0^{\ell T} y(t)\,\rho_n^{-1}(t,s)\,y(s)\,dt\,ds. \qquad (4.3)$$


Define $y_m(t)$ to be the output of the channel when the transmitted signal is $s_m(t)$, i.e.,

$$y_m(t) = \int_{\max\{t-LT,\,0\}}^{\min\{t,\,T\}} s_m(\tau)\,h_c(t-\tau)\,d\tau, \qquad (4.4)$$

where it is assumed that $h_c(t) = 0$ for all $t > LT$ and for all $t < 0$. Note that $y_m(t) = 0$ whenever $t \notin [0, (L+1)T]$, and that by superposition,

$$y(t) = \sum_{k=0}^{\infty} y_{m(k)}(t - kT). \qquad (4.5)$$

Defining

$$\beta_m(k) = \begin{cases} 1, & \text{if } m(k) = m \\ 0, & \text{otherwise,} \end{cases} \qquad (4.6)$$

for $0 \le t \le \ell T$,

$$y(t) = \sum_{k=0}^{\ell}\sum_{m=1}^{M} \beta_m(k)\,y_m(t - kT). \qquad (4.7)$$
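Define

$$z_{m,k,\ell}(t) = \int_0^{\ell T} \rho_n^{-1}(t,s)\,y_m(s - kT)\,ds, \qquad (4.8)$$

$$\alpha_{m,k,\ell} = \int_0^{\ell T} r(t)\,z_{m,k,\ell}(t)\,dt, \qquad (4.9)$$

and

$$\gamma_{m_1,k_1,m,k,\ell} = \int_0^{\ell T} y_{m_1}(t - k_1 T)\,z_{m,k,\ell}(t)\,dt. \qquad (4.10)$$

Note that from (4.8)-(4.10), $\gamma_{m_1,k_1,m,k,\ell} = \gamma_{m,k,m_1,k_1,\ell}$. From the above definitions, (4.3) can be expressed as

$$\Lambda^*(r,\ell T) = \sum_{k=0}^{\ell}\sum_{m=1}^{M} \beta_m(k)\,\alpha_{m,k,\ell} - \frac{1}{2}\sum_{k=0}^{\ell}\sum_{m=1}^{M}\sum_{k_1=0}^{\ell}\sum_{m_1=1}^{M} \beta_m(k)\,\beta_{m_1}(k_1)\,\gamma_{m_1,k_1,m,k,\ell}. \qquad (4.11)$$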

Note that the $z$'s and the $\gamma$'s are constants which may be computed a priori, and that $\{\alpha_{m,k,\ell}: 1 \le m \le M,\ 0 \le k \le \ell\}$ is a set of sufficient statistics for $\{i_k\}_{k=0}^{\ell}$. Furthermore, note that maximizing $\Lambda^*$ given by (4.11) with respect to $\{\beta_m(k): 1 \le m \le M,\ 0 \le k \le \ell\}$ is equivalent to maximizing $\Lambda^*$ given by (4.3) with respect to $\{i_k\}_{k=0}^{\ell}$.

To be of any practical consequence for most digital communication applications, a recursive procedure for computing $\Lambda^*$ is necessary. In general, a recursive relationship (in $\ell$) for $\Lambda^*$ is impossible unless either a recursive relationship exists for $\alpha_{m,k,\ell}$ and $\gamma_{m_1,k_1,m,k,\ell}$, or

$$\alpha_{m,k,\ell} = \alpha_{m,k} \qquad (4.12)$$

and

$$\gamma_{m_1,k_1,m,k,\ell} = \gamma_{m_1,k_1,m,k}, \qquad (4.13)$$

where the absence of the subscript $\ell$ is to be interpreted as being independent of $\ell$. The latter approach ((4.12) and (4.13)) is taken here, in analogy with the assumptions made by Ungerboeck [10]. With these assumptions, and using the symmetry of the $\gamma$'s,

$$\begin{aligned}
\Lambda^*(r,(\ell+1)T) ={}& \Lambda^*(r,\ell T) + \sum_{m=1}^{M}\beta_m(\ell+1)\,\alpha_{m,\ell+1} \\
&- \frac{1}{2}\sum_{m=1}^{M}\sum_{m_1=1}^{M}\beta_{m_1}(\ell+1)\,\beta_m(\ell+1)\,\gamma_{m_1,\ell+1,m,\ell+1} \\
&- \sum_{k=0}^{\ell}\sum_{m=1}^{M}\sum_{m_1=1}^{M}\beta_m(k)\,\beta_{m_1}(\ell+1)\,\gamma_{m_1,\ell+1,m,k}. \qquad (4.14)
\end{aligned}$$

Note that the number of computations needed to compute $\Lambda^*(r,(\ell+1)T)$ grows linearly with $\ell$. Assuming that $\rho_n^{-1}(t,s) = 0$ whenever $|t-s| > L_1 T$, from (4.8)-(4.10) it can be shown that

$$z_{m,k}(t) = 0 \text{ whenever } t \notin [(k - L_1)T,\ (L + L_1 + k + 1)T], \qquad (4.15)$$

and

$$\gamma_{m_1,k_1,m,k} = 0 \text{ whenever } |k - k_1| > L + L_1. \qquad (4.16)$$


Substituting (4.15) and (4.16) into (4.14), $\Lambda^*$ can be written as

$$\Lambda^*(r,(\ell+1)T) = \Lambda^*(r,\ell T) + \sum_{m=1}^{M}\beta_m(\ell+1)\Big\{\alpha_{m,\ell+1} - \frac{1}{2}\gamma_{m,\ell+1,m,\ell+1} - \sum_{k=\ell-L-L_1+1}^{\ell}\sum_{m_1=1}^{M}\beta_{m_1}(k)\,\gamma_{m,\ell+1,m_1,k}\Big\}, \qquad (4.17)$$

where use has been made of the properties of $\beta_m(k)$. The result (4.17) represents an extension of equation (27) of Ungerboeck [10] to general nonlinear modulation schemes.

Following Ungerboeck [10], a modified Viterbi algorithm can be applied to (4.17) to obtain a maximum-likelihood sequence estimate, $\{\hat{i}_k\}_{k=1}^{\ell}$, for $\{i_k\}_{k=1}^{\ell}$. Recall that estimating $\beta_m(k)$ is equivalent to estimating $m(k)$ or $i_k$. Assume that $\{i_k\}$ is a coded sequence with $\mu_j$ the state of the coder after $i_j$ has been transmitted. Assume further that the sequence of states $\{\mu_j\}$ is a Markov sequence, i.e., that $\Pr(\mu_j \mid \mu_{j-1}, \mu_{j-2}, \ldots) = \Pr(\mu_j \mid \mu_{j-1})$. Given $\mu_j$ and an allowable sequence $\{i_{j+1}, i_{j+2}, \ldots, i_{j+k}\}$, the state $\mu_{j+k}$ is uniquely determined. Consider the so-called survivor metric


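$$\tilde{\Lambda}_{\ell,k}(\sigma_{\ell,k}) = \max_{\{\ldots,\,i_{\ell-k-1},\,i_{\ell-k}\} \to \mu_{\ell-k}} \{\Lambda^*(r,\ell T)\}, \qquad (4.18)$$

where $\sigma_{\ell,k} = (\mu_{\ell-k};\ i_{\ell-k+1}, \ldots, i_\ell)$ is called the survivor state and the maximum is taken over all allowable sequences $\{i_j\}_{j=1}^{\ell-k}$ that put the coder into state $\mu_{\ell-k}$. Note that $\Pr(\sigma_{\ell,k} \mid \sigma_{\ell-1,k}, \sigma_{\ell-2,k}, \ldots) = \Pr(\sigma_{\ell,k} \mid \sigma_{\ell-1,k})$, i.e., that $\{\sigma_{\ell,k}\}$ is a Markov sequence. Associated with each $\sigma_{\ell,k}$ is at least one path history $\{\ldots, i_{\ell-k-1}, i_{\ell-k}\}$ which yields the maximum in (4.18). Ungerboeck [10] claims that this maximizing path history is unique. Define

$$J_{\ell,\ell+1} = -\sum_{m=1}^{M}\beta_m(\ell+1)\Big\{\frac{1}{2}\gamma_{m,\ell+1,m,\ell+1} + \sum_{k=\ell-L-L_1+1}^{\ell}\sum_{m_1=1}^{M}\beta_{m_1}(k)\,\gamma_{m,\ell+1,m_1,k}\Big\}; \qquad (4.19)$$

then

$$\Lambda^*(r,(\ell+1)T) = \Lambda^*(r,\ell T) + \sum_{m=1}^{M}\beta_m(\ell+1)\,\alpha_{m,\ell+1} + J_{\ell,\ell+1}. \qquad (4.20)$$

Note that the only elements of $\{\beta_m(j)\}$ which appear explicitly in (4.19) and (4.20) are $\{\beta_m(j): 1 \le m \le M,\ \ell-L-L_1+1 \le j \le \ell+1\}$. Recall that a one-to-one relationship exists between $i_j$ and $\beta_m(j)$. Consequently, letting $k = L+L_1+1$ and substituting (4.20) into (4.18),

$$\begin{aligned}
\tilde{\Lambda}_{\ell+1,L+L_1+1}(\sigma_{\ell+1,L+L_1+1})
&= \max_{\{\ldots,\,i_{\ell-L-L_1}\} \to \mu_{\ell-L-L_1}} \Big\{\Lambda^*(r,\ell T) + J_{\ell,\ell+1} + \sum_{m=1}^{M}\beta_m(\ell+1)\,\alpha_{m,\ell+1}\Big\} \\
&= \sum_{m=1}^{M}\beta_m(\ell+1)\,\alpha_{m,\ell+1} + \max_{\{\ldots,\,i_{\ell-L-L_1}\} \to \mu_{\ell-L-L_1}} \{\Lambda^*(r,\ell T) + J_{\ell,\ell+1}\} \\
&= \sum_{m=1}^{M}\beta_m(\ell+1)\,\alpha_{m,\ell+1} + \max_{\sigma_{\ell,L+L_1+1} \to \sigma_{\ell+1,L+L_1+1}} \{\tilde{\Lambda}_{\ell,L+L_1+1}(\sigma_{\ell,L+L_1+1})\} + J_{\ell,\ell+1}. \qquad (4.21)
\end{aligned}$$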

For each $\ell$, survivor metrics and path histories must be calculated for all possible states $\sigma_{\ell,L+L_1+1}$. One would expect that the further one looks back from $\ell$, the more nearly identical all path histories become. In practice, one would choose the path history with the largest survivor metric for some $\ell - L^*$. If one chooses $L^*$ large enough, then the difference in performance between allowing an unbounded delay and a delay of $L^*$ is negligible. Apparently, analytical guidelines for the choice of $L^*$ remain an open issue.
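The survivor recursion can be illustrated on a small discrete example. The sketch below is a simplified illustration under several assumptions that are not in the text: the sequence is uncoded, the self terms `gamma0[m]` and coupling terms `gamma[d-1][m][m1]` depend only on the symbols and on the lag `d` (not on absolute time), and the tables `alpha`, `gamma0`, and `gamma` are filled with arbitrary random numbers rather than computed from a channel. The state kept per survivor is the last `depth` symbols, where `depth` plays the role of $L + L_1$; this is the minimum needed by later increments.

```python
import random
from itertools import product

def increment(k, m, state, alpha, gamma0, gamma):
    """Metric increment for symbol m at step k, given the previous symbols in `state`."""
    inc = alpha[k][m] - 0.5 * gamma0[m]
    for d in range(1, len(state) + 1):
        inc -= gamma[d - 1][m][state[-d]]    # coupling with the symbol sent d steps earlier
    return inc

def viterbi(alpha, gamma0, gamma):
    """Maximize the accumulated metric; return (best_metric, best_symbol_path)."""
    depth = len(gamma)
    survivors = {(): (0.0, [])}              # survivor state -> (survivor metric, path history)
    for k in range(len(alpha)):
        nxt = {}
        for state, (metric, path) in survivors.items():
            for m in range(len(gamma0)):
                cand = metric + increment(k, m, state, alpha, gamma0, gamma)
                new_state = (state + (m,))[-depth:] if depth else ()
                if new_state not in nxt or cand > nxt[new_state][0]:
                    nxt[new_state] = (cand, path + [m])
        survivors = nxt
    return max(survivors.values(), key=lambda item: item[0])

def brute_force(alpha, gamma0, gamma):
    """Exhaustive search over all symbol sequences, used only as a cross-check."""
    depth, n, best = len(gamma), len(alpha), None
    for seq in product(range(len(gamma0)), repeat=n):
        total = sum(increment(k, seq[k], seq[max(0, k - depth):k], alpha, gamma0, gamma)
                    for k in range(n))
        if best is None or total > best[0]:
            best = (total, list(seq))
    return best

# Arbitrary illustrative tables: M = 3 signals, 6 symbols, memory of 2 symbols.
random.seed(3)
M, n, depth = 3, 6, 2
alpha = [[random.uniform(-1, 1) for _ in range(M)] for _ in range(n)]
gamma0 = [random.uniform(0.5, 1.5) for _ in range(M)]
gamma = [[[random.uniform(-0.5, 0.5) for _ in range(M)] for _ in range(M)]
         for _ in range(depth)]
metric, path = viterbi(alpha, gamma0, gamma)
print(path, round(metric, 4))
```

The exhaustive search grows as $M^n$, while the survivor recursion keeps at most $M^{\text{depth}}$ survivors per step, which is the practical reason for the recursive form. In this sketch the full path history is retained; truncating it to a finite decision delay corresponds to the choice of $L^*$ discussed above.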
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B.  Adaptive  Maximum-Likelihood  Sequence  Estimation 

In order to utilize the maximum-likelihood sequence estimation algorithm represented by (4.21), the channel, $\{h_c(\tau): 0 \le \tau \le LT\}$, and the inverse kernel $\{\rho_n^{-1}(t,s): |t-s| \le L_1 T\}$ are needed. In this section, adaptive techniques for approximating $h_c$ and $\rho_n^{-1}$ are presented.

First, consider adaptively estimating the channel, $h_c$. Recall that $r(t) = y(t) + n(t)$, and that

$$y(t) = \sum_{k=0}^{\infty} y_{m(k)}(t - kT) = \sum_{k=0}^{\infty}\sum_{m=1}^{M} \beta_m(k)\,y_m(t - kT). \qquad (4.22)$$

If the channel were known and if the transmitter and receiver clocks were synchronized, then the signal components $\{y_m(\cdot): 1 \le m \le M\}$ could be generated at the receiver using (4.4). In this case, $y(t)$ could be estimated as

$$\hat{y}(t) = \sum_{k=0}^{\infty}\sum_{m=1}^{M} \hat{\beta}_m(k)\,y_m(t - kT), \qquad (4.23)$$

where $\{\hat{\beta}_m(k)\}$ is an estimate of $\{\beta_m(k)\}$ obtained from the modified Viterbi algorithm.

Since the channel is unknown, consider an $NL$-tap transversal filter with tap spacing $D$, tap weights $\{w_i\}_{i=1}^{NL}$, and input $s_m((j - n_c)D)$, where $n_c$ is an integer used to denote the unknown timing relationship between the transmitter and receiver clocks. Let $ND = T$ and consider

$$\hat{y}_m(jD) = \sum_{i=1}^{NL} s_m((j - i + n_c + 1)D)\,w_i \qquad (4.24)$$

as an estimate of $y_m(jD)$. With the adaptive direct channel model of Section III-B as motivation, consider

$$\hat{y}(jD) = \sum_{k=0}^{\infty}\sum_{m=1}^{M} \hat{\beta}_m(k)\,\hat{y}_m(jD - kT) \qquad (4.25)$$

as an estimate of $y(jD)$, with $\{w_i\}_{i=1}^{NL}$ chosen to minimize

$$\xi(W) = N^{-1}\sum_{j=1}^{N} E\big((r(jD) - \hat{y}(jD))^2\big), \qquad (4.26)$$

where

$$W = (w_1, w_2, \ldots, w_{NL})'. \qquad (4.27)$$

Define

$$S_{m,j,k} = \big(s_m((j + n_c)D - kT), \ldots, s_m((j - NL + n_c + 1)D - kT)\big)' \qquad (4.28)$$

and

$$Z_j = \sum_{k=0}^{\infty}\sum_{m=1}^{M} \hat{\beta}_m(k)\,S_{m,j,k}. \qquad (4.29)$$

Then

$$\hat{y}(jD) = W'Z_j, \qquad (4.30)$$

and the function to be minimized is

$$\xi(W) = N^{-1}\sum_{j=1}^{N} E(r_j^2) - 2N^{-1}\,W'\sum_{j=1}^{N} E(r_j Z_j) + N^{-1}\,W'\sum_{j=1}^{N} E(Z_j Z_j')\,W, \qquad (4.31)$$

where $r_j = r(jD)$.

Assuming that $\{i_k\}$ and $\{n(t)\}$ are jointly wide-sense stationary, $E(r^2(jD))$ is periodic (in $j$) with period $N$. If $\hat{\beta}_m(k) = \beta_m(k)$, then $E(r_j Z_j)$, $E(\hat{y}^2(jD))$, and $E(Z_j Z_j')$ are also periodic (in $j$) with period $N$. It is assumed in the following that these periodicities are present. If

$$R = N^{-1}\sum_{j=1}^{N} E(Z_j Z_j') \qquad (4.32)$$

is positive definite and

$$P = N^{-1}\sum_{j=1}^{N} E(r_j Z_j) \qquad (4.33)$$

is nonzero, then the unique minimizing weight vector, $w_o$, is given by $w_o = R^{-1}P$.


Let $\{h_i\}_{i=1}^{NL}$ be the weighting sequence for a transversal filter model of the channel, $h_c$, with tap spacing $D$. Then $y_m(jD)$ (from (4.4)) can be approximated as

$$y_m(jD) = \sum_{i=1}^{NL} s_m((j - i + 1)D)\,h_i. \qquad (4.34)$$

Defining

$$H = (h_1, h_2, \ldots, h_{NL})', \qquad (4.35)$$

then

$$y_m(jD - kT) = S_{m,j-n_c,k}'\,H, \qquad (4.36)$$

and

$$y(jD) \approx \sum_{k=0}^{\infty}\sum_{m=1}^{M} \beta_m(k)\,S_{m,j-n_c,k}'\,H. \qquad (4.37)$$

Now, assuming that $\hat{\beta}_m(k) \approx \beta_m(k)$ and that $\{\hat{\beta}_m(k)\}$ is approximately independent of $\{n(t)\}$,

$$y(jD) \approx Z_{j-n_c}'\,H, \qquad (4.38)$$

and

$$E(r_j Z_j) \approx E(Z_j Z_{j-n_c}')\,H. \qquad (4.39)$$

Consequently, from (4.32), (4.33), and $w_o = R^{-1}P$, $w_o$ is an approximate solution to

$$\sum_{j=1}^{N} E(Z_j Z_j')\,W = \sum_{j=1}^{N} E(Z_j Z_{j-n_c}')\,H. \qquad (4.40)$$

Define $\{z_\ell\}$ by $z_\ell = (Z_\ell)_1$. By assumption, $\{z_\ell\}$ is a "wide-sense cyclostationary process," i.e., $E(z_\ell z_{\ell+u})$ is periodic (in $\ell$) with period $N$. Defining

$$\rho_z(u) = N^{-1}\sum_{\ell=1}^{N} E(z_\ell z_{\ell+u}), \qquad (4.41)$$

and letting $w_i = (W)_i$, (4.40) may be expressed as

$$\sum_{j=1}^{NL} \rho_z(i-j)\,w_j = \sum_{j=1}^{NL} \rho_z(i-j-n_c)\,h_j, \quad i = 1, 2, \ldots, NL. \qquad (4.42)$$

Equation (4.42) suggests that $\{w_j\}$ is a shifted version of $\{h_j\}$. For example, suppose that $NL \gg n_c \ge 1$, and that $h_j = 0$ for $NL - n_c + 1 \le j \le NL$. By assumption, the solution $\{w_j\}$ to (4.42) is unique. Consequently, since

$$w_j = \begin{cases} h_{j-n_c}, & n_c + 1 \le j \le NL \\ 0, & 1 \le j \le n_c \end{cases} \qquad (4.43)$$

satisfies (4.42), it must be the unique solution. These remarks offer strong justification for the claim that if $|n_c| \ll NL$, then $w_o = R^{-1}P$ is approximately a shifted version of the discrete-time channel model, $H$. The resulting shift is, of course, desired in order to line up the receiver-generated estimate of the received process with the actual received process. This property is quite similar to that of the direct channel model of Section III-B, except the present scheme exhibits no intersymbol interference.
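The statement that the shifted sequence satisfies (4.42) can be checked by direct substitution. The short verification below is a sketch under the reading that $w_j = h_{j-n_c}$ for $n_c + 1 \le j \le NL$, $w_j = 0$ for $1 \le j \le n_c$, and $h_j = 0$ for $NL - n_c + 1 \le j \le NL$; the summation index $q = j - n_c$ is introduced here only for the check.

```latex
\begin{aligned}
\sum_{j=1}^{NL} \rho_z(i-j)\, w_j
  &= \sum_{j=n_c+1}^{NL} \rho_z(i-j)\, h_{j-n_c}
     && \text{(since } w_j = 0 \text{ for } 1 \le j \le n_c\text{)} \\
  &= \sum_{q=1}^{NL-n_c} \rho_z(i-q-n_c)\, h_q
     && \text{(substituting } q = j - n_c\text{)} \\
  &= \sum_{q=1}^{NL} \rho_z(i-q-n_c)\, h_q
     && \text{(since } h_q = 0 \text{ for } NL-n_c+1 \le q \le NL\text{)} .
\end{aligned}
```

The last line is the right-hand side of (4.42), so the shifted sequence is a solution for every $i$, and uniqueness then identifies it with $w_o$.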

Suitable algorithms for estimating $w_o = R^{-1}P$, with $R$ and $P$ given by (4.32) and (4.33), are represented by

$$W_{n+1} = W_n + \mu_n (P_n - F_n W_n), \qquad (4.44)$$

where

$$F_n = K_n^{-1}\sum_{\ell=n-K_n+1}^{n} Z_\ell Z_\ell' \qquad (4.45)$$

and

$$P_n = K_n^{-1}\sum_{\ell=n-K_n+1}^{n} r_\ell Z_\ell. \qquad (4.46)$$

Such algorithms fit the general framework presented in Chapter II, and hence, the convergence results of Chapter II are applicable to (4.44)-(4.46). If $\hat{\beta}_m(k) = \beta_m(k)$, then Theorem 2 of Chapter II is applicable to obtain rather mild conditions for which $W_n \xrightarrow{a.s.} w_o$. The occurrence of $\hat{\beta}_m(k)$ in (4.29) represents what is commonly called "decision-feedback." Such schemes always have a finite probability of a "runaway."
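The channel-estimation update can be illustrated with $K_n = 1$ and error-free decisions. This is a minimal sketch under stated assumptions, not a proposed receiver: the three two-sample waveforms in `signals`, the channel taps `h`, the timing offset `n_c`, the noise level, and the gain $2/(j+20)$ are hypothetical, and the decisions $\hat{\beta}_m(k)$ are taken to equal the transmitted symbols, so no runaway can occur here.

```python
import random

def channel_estimate_step(w, z_vec, r_j, mu):
    """One step of W <- W + mu (P_n - F_n W) with K_n = 1, i.e. W + mu * Z_j * (r_j - W'Z_j)."""
    err = r_j - sum(wi * zi for wi, zi in zip(w, z_vec))
    return [wi + mu * zi * err for wi, zi in zip(w, z_vec)]

# Synthetic demonstration (illustrative assumptions throughout).
random.seed(4)
signals = [[1.0, 1.0], [1.0, -1.0], [-1.0, 0.0]]   # M = 3 waveforms, N = 2 samples per symbol
h = [0.9, 0.4, -0.2]                               # hypothetical channel taps, spacing D
NL, n_c = 6, 2                                     # filter length and timing offset
symbols = [random.randrange(3) for _ in range(3000)]
x = [sample for m in symbols for sample in signals[m]]   # transmitted samples

w = [0.0] * NL
for j in range(NL, len(x) - n_c):
    r_j = sum(h[i] * x[j - i] for i in range(len(h))) + random.gauss(0.0, 0.05)
    z_vec = [x[j + n_c - i] for i in range(NL)]    # Z_j built from (assumed correct) decisions
    w = channel_estimate_step(w, z_vec, r_j, mu=2.0 / (j + 20.0))
print([round(v, 2) for v in w])
```

With these assumptions the trained weights should approximate the channel taps delayed by `n_c` positions, with the remaining taps near zero, which is the shift property discussed for (4.42). In a decision-directed receiver the vector `z_vec` would instead be built from detector outputs, and erroneous decisions would perturb the update.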

Now consider the problem of estimating the inverse kernel, $\rho_n^{-1}$. Suppose that a time-invariant approximate whitening filter, $h_w(t)$, for $\{n(t)\}$ exists. Then, denoting the output of the whitening filter by $n^*(t)$, the spectral density of $n^*(t)$ by $S^*(f)$, the spectral density of $n(t)$ by $S_n(f)$, and the transfer function for $h_w(t)$ by $H_w(f)$, we have

$$1 \approx S^*(f) = S_n(f)\,|H_w(f)|^2. \qquad (4.47)$$

Consequently,

$$\rho_n^{-1}(t,s) \approx F^{-1}\big(S_n^{-1}(f)\big) \approx F^{-1}\big(|H_w(f)|^2\big), \qquad (4.48)$$

where $F^{-1}$ denotes inverse Fourier transform. From (4.48) it is apparent that $h_w$ convolved with itself is an approximation to $\rho_n^{-1}$. Hence, via (4.48), an approximation of $h_w$ will result in an approximation to $\rho_n^{-1}$ as well.

Recall that $r(t) = y(t) + n(t)$. Using $\hat{y}(jD)$ from (4.25), an estimate of $n(jD)$ is given by

$$\hat{n}(jD) = r(jD) - \hat{y}(jD). \qquad (4.49)$$

Techniques analogous to those proposed, e.g., by Widrow et al. [13] will now be used to obtain an adaptive transversal filter approximation to $h_w$. Define

$$\hat{N}_\ell = \big(\hat{n}((\ell-1)D),\ \hat{n}((\ell-2)D),\ \ldots,\ \hat{n}((\ell-p)D)\big)', \qquad (4.50)$$

and

$$W = (w_1, w_2, \ldots, w_p)'. \qquad (4.51)$$

Define

$$\hat{n}_\ell^* = \hat{n}(\ell D) - W'\hat{N}_\ell, \qquad (4.52)$$

and consider choosing $W$ to minimize

$$\xi(W) = E\big((\hat{n}_\ell^*)^2\big). \qquad (4.53)$$

Defining

$$R_{nn} = E(\hat{N}_\ell \hat{N}_\ell') \qquad (4.54)$$

and

$$P = E\big(\hat{n}(\ell D)\,\hat{N}_\ell\big), \qquad (4.55)$$

the minimizing weight vector, $w_o$, is given by

$$w_o = R_{nn}^{-1} P, \qquad (4.56)$$


assuming that $R_{nn}$ is positive definite. The main idea is that $w_o'\hat{N}_\ell$ is the best (in the minimum mean-squared error sense) linear predictor of $\hat{n}(\ell D)$ based on the previous $p$ samples, $\hat{N}_\ell$. Subtracting this prediction of $\hat{n}(\ell D)$ from $\hat{n}(\ell D)$ should then tend to "whiten" the residual, $\hat{n}_\ell^*$. The resulting approximation to $\rho_n^{-1}$ is a transversal filter with unit pulse response $c\,g_u$, with

$$g_u = \begin{cases}
1, & u = 0 \\
-2w_1, & u = 1 \\
-2w_u + \sum_{i=1}^{u-1} w_i w_{u-i}, & 2 \le u \le p \\
\sum_{i=u-p}^{p} w_i w_{u-i}, & p+1 \le u \le 2p \\
0, & \text{elsewhere,}
\end{cases} \qquad (4.57)$$

where $c$ is a constant used to normalize (4.52) so that $\hat{n}_\ell^*$ has unit variance. Of course, a suitable estimate for $c$ is

$$c_\ell = \Big(K_\ell^{-1}\sum_{j=\ell-K_\ell+1}^{\ell} (\hat{n}_j^*)^2\Big)^{-1},$$

where $K_\ell = 1$, $K$, or $\ell$, for example.
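The pulse response in (4.57) can be cross-checked numerically. The sketch below assumes the reading of (4.57) in which $g_0 = 1$, $g_1 = -2w_1$, $g_u = -2w_u + \sum_{i=1}^{u-1} w_i w_{u-i}$ for $2 \le u \le p$, and $g_u = \sum_{i=u-p}^{p} w_i w_{u-i}$ for $p+1 \le u \le 2p$, and compares it with the self-convolution of the prediction-error filter $(1, -w_1, \ldots, -w_p)$. The weight values are arbitrary illustrative numbers, and the function names are hypothetical.

```python
def g_sequence(w):
    """Evaluate g_u for u = 0..2p from predictor weights w = (w_1, ..., w_p)."""
    p = len(w)

    def wt(i):
        return w[i - 1]                      # 1-based access, as in the text

    g = [1.0, -2.0 * wt(1)]
    for u in range(2, 2 * p + 1):
        if u <= p:
            g.append(-2.0 * wt(u) + sum(wt(i) * wt(u - i) for i in range(1, u)))
        else:
            g.append(sum(wt(i) * wt(u - i) for i in range(u - p, p + 1)))
    return g

def self_convolution(e):
    """Convolve the finite sequence e with itself."""
    return [sum(e[i] * e[u - i] for i in range(len(e)) if 0 <= u - i < len(e))
            for u in range(2 * len(e) - 1)]

w = [0.5, -0.3, 0.1]                         # arbitrary predictor weights, p = 3
g = g_sequence(w)
conv = self_convolution([1.0] + [-x for x in w])
print([round(v, 4) for v in g])
print([round(v, 4) for v in conv])
```

The two printed sequences agree, which is consistent with the remark that the whitening filter convolved with itself approximates the inverse kernel: the prediction-error filter plays the role of the (unnormalized) whitening filter, and the constant $c$ supplies the normalization.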


Suitable algorithms for approximating $w_o$ given by (4.56) are represented by

$$W_{n+1} = W_n + \mu_n (P_n - F_n W_n),$$

with

$$F_n = K_n^{-1}\sum_{\ell=n-K_n+1}^{n} \hat{N}_\ell \hat{N}_\ell'$$

and

$$P_n = K_n^{-1}\sum_{\ell=n-K_n+1}^{n} \hat{n}(\ell D)\,\hat{N}_\ell.$$

In case $\hat{n}(\ell D) = n(\ell D)$, then from Chapter II, Theorem 7, $W_n \xrightarrow{a.s.} w_o$ provided that for some $\nu > 1/2$, $u^{\nu}\,|\rho_n(t, t+u)|$ is uniformly bounded for all real $t$ and all non-negative integers $u$, and that $\{\mu_n\}$ satisfies Condition B3.

V. CONCLUSION

Several  adaptive  communication  systems  have  been  presented  and 
analyzed.  The  analysis  has  included  convergence  properties  and  various 
kinds  of  "self-synchronization.”  In  many  applications,  e.g.,  transmission 
of  digital  data  over  the  undersea  channel,  intersymbol  interference 
becomes  an  important  problem.  For  such  applications,  the  adaptive 
maximum-likelihood sequence estimation schemes presented in Chapter IV
are  especially  appropriate. 

The  results  presented  in  Chapter  IV,  which  are  believed  to  be  new, 
require additional analytical investigations as well as simulation
studies,  before  corresponding  practical  systems  can  realistically 
be  proposed.  Of  primary  importance  are  the  convergence  properties  of 
decision-directed  algorithms  and  suitable  initialization  procedures. 
Decision-directed  algorithms  suffer  from  a phenomenon  called  a 
"runaway."  A runaway  occurs  when  the  detector  makes  a sequence  of 
erroneous  decisions,  which  degrades  the  parameter  estimates,  which  in 
turn  further  degrade  the  detector  performance.  The  only  analytical 
results treating this problem of which this author is aware are those
of  Davisson  and  Schwartz  [16].  A possible  method  of  reducing  the 
probability  of  a runaway  is  to  incorporate  a "null-zone"  detection 
scheme  into  the  decision-feedback  loop.  A combination  of  the  null-zone 
decision-feedback  scheme  of  Gitlin  and  Ho  [17]  and  the  results  of 
Chapter  IV  could  well  lead  to  a practical  "high  performance"  receiver 
for  "nonlinear"  modulation  schemes. 


Closely related to the techniques presented in Chapter III are adaptive reference estimators [19] used in conjunction with a predistorted-replica correlation receiver [18]. Such techniques, which make use of decision-feedback strategies, suffer a degradation in performance due to intersymbol interference in a manner quite similar to the schemes presented in Chapter III. Furthermore, such schemes exhibit a nonzero probability of a runaway, as do other decision-feedback strategies.